Register the Microsoft package repository and install the .NET Core SDK:

    sudo rpm -Uvh https://packages.microsoft.com/config/rhel/7/packages-microsoft-prod.rpm
    sudo yum update
    sudo yum install dotnet-sdk-2.2
    dotnet

Extract the .NET for Apache Spark package, then set the environment variable. Use $HOME rather than ~ here, because the shell does not expand a tilde inside double quotes:

    export DOTNET_WORKER_DIR="$HOME/bin/Microsoft.Spark.Worker-0.4.0"

Create a console application and add the Microsoft.Spark package:

    dotnet new console -o mySparkApp
    cd mySparkApp
    dotnet add package Microsoft.Spark --version 0.4.0

Create a file named input.txt with the following content:

    Hello World
    This .NET app uses .NET for Apache Spark
    This .NET app counts words with Apache Spark

Replace the contents of Program.cs with:

    using Microsoft.Spark.Sql;

    namespace MySparkApp
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Create a Spark session
                var spark = SparkSession
                    .Builder()
                    .AppName("word_count_sample")
                    .GetOrCreate();

                // Create initial DataFrame
                DataFrame dataFrame = spark.Read().Text("input.txt");

                // Count words
                var words = dataFrame
                    .Select(Functions.Split(Functions.Col("value"), " ").Alias("words"))
                    .Select(Functions.Explode(Functions.Col("words"))
                    .Alias("word"))
                    .GroupBy("word")
                    .Count()
                    .OrderBy(Functions.Col("count").Desc());

                // Show results
                words.Show();
            }
        }
    }

Build the program:

    dotnet build

Run the program:

    spark-submit \
        --class org.apache.spark.deploy.dotnet.DotnetRunner \
        --master local \
        bin/Debug/netcoreapp2.2/microsoft-spark-2.4.x-0.4.0.jar \
        dotnet bin/Debug/netcoreapp2.2/mySparkApp.dll
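
To sanity-check what the Spark job should report, the same word count can be approximated with standard Unix tools. This is only a rough cross-check, not part of .NET for Apache Spark. It assumes input.txt holds the three sample lines from this walkthrough, and it splits on single spaces just as Functions.Split(..., " ") does in Program.cs. The variable name counts is chosen here for illustration.

```shell
# Recreate the sample input (assumed to be the three lines used in this walkthrough).
cat > input.txt <<'EOF'
Hello World
This .NET app uses .NET for Apache Spark
This .NET app counts words with Apache Spark
EOF

# Put each space-separated word on its own line, count duplicates,
# then sort by count in descending order.
counts=$(tr ' ' '\n' < input.txt | sort | uniq -c | sort -rn)
printf '%s\n' "$counts"
```

The first row should be .NET with a count of 3, which should match the top row of words.Show(). The order of words with equal counts may differ from Spark's output, since neither this pipeline nor the OrderBy in Program.cs defines a tie-break that the other shares.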