Scala Examples: Reading Files, Writing Files, and Console Operations

2025-01-31 17:07:08

Reading Files in Scala

The file scalaIO.txt in the root directory of drive E: has the following contents:

Sample code for reading the file:

 // Read a file line by line
 import scala.io.Source

 val file = Source.fromFile("E:\\scalaIO.txt")
 for (line <- file.getLines()) {
  println(line)
 }
 // Release the underlying file handle when done
 file.close()
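The snippet above leaves the file open if an exception is thrown while reading. A minimal sketch of a safer variant follows, assuming Scala 2.13 or later (for scala.util.Using). The object name ReadLinesSketch, the helper readLines, the explicit "UTF-8" encoding, and the temporary file standing in for E:\scalaIO.txt are all illustrative assumptions, not part of the original example.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import scala.io.Source
import scala.util.{Try, Using}

object ReadLinesSketch {
  // Reads every line into a List. Using closes the Source whether or not
  // reading succeeds, and any exception is captured in the returned Try.
  def readLines(path: String): Try[List[String]] =
    Using(Source.fromFile(path, "UTF-8"))(_.getLines().toList)

  def main(args: Array[String]): Unit = {
    // Hypothetical stand-in for E:\scalaIO.txt so the sketch runs anywhere.
    val tmp = Files.createTempFile("scalaIO", ".txt")
    Files.write(tmp, "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8))

    // Print each line if the read succeeded; do nothing on failure.
    readLines(tmp.toString).foreach(_.foreach(println))

    Files.delete(tmp)
  }
}
```

The lines are materialized with toList inside the Using block because the iterator returned by getLines() can no longer be read once the Source has been closed.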

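Note 1: in file = Source.fromFile("E:\\scalaIO.txt"), the fromFile() method belongs to the Source object in the scala.io package, which is brought in with import scala.io.Source. Its source code is shown in the figure below: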

file.getLines() returns an iterator (Iterator[String]); its source code, from the scala.io package, is as follows:
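Because getLines() returns an Iterator, the lines can be traversed only once. The following is not the library source, just a minimal sketch of that behavior; the object name IteratorSketch and the helper consumeOnce are hypothetical, and Source.fromString is used as an in-memory stand-in for a real file.

```scala
import scala.io.Source

object IteratorSketch {
  // Returns the lines collected in a first pass, together with a flag
  // telling whether the iterator still has elements afterwards.
  def consumeOnce(text: String): (List[String], Boolean) = {
    val source = Source.fromString(text) // in-memory stand-in for a file
    try {
      val lines: Iterator[String] = source.getLines()
      val firstPass = lines.toList // this traversal exhausts the iterator
      (firstPass, lines.hasNext)
    } finally source.close()
  }

  def main(args: Array[String]): Unit = {
    val (firstPass, hasMore) = consumeOnce("a\nb\nc")
    println(firstPass)
    println(hasMore)
  }
}
```

If the lines are needed more than once, it is usually simplest to convert the iterator to a List or Vector first and work with that collection.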

Reading Network Resources in Scala

 // Read a network resource
 val webFile=Source.fromURL("http://spark.apache.org")
 webFile.foreach(print)
 webFile.close()

The source code of the fromURL() method is as follows:

 /** same as fromURL(new URL(s))
 */
 def fromURL(s: String)(implicit codec: Codec): BufferedSource =
 fromURL(new URL(s))(codec)

The content read from the network resource is as follows:

<!DOCTYPE html>
<html lang="en">
<head>
 <meta charset="utf-8">
 <meta http-equiv="X-UA-Compatible" content="IE=edge">
 <meta name="viewport" content="width=device-width, initial-scale=1.0">

 <title>
  Apache Spark™ - Lightning-Fast Cluster Computing

 </title>

 <meta name="description" content="Apache Spark is a fast and general engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing.">

 <!-- Bootstrap core CSS -->
 <link href="/css/cerulean.min.css" rel="external nofollow" rel="stylesheet">
 <link href="/css/custom.css" rel="external nofollow" rel="stylesheet">

 <script type="text/javascript">
 <!-- Google Analytics initialization -->
 var _gaq = _gaq || [];
 _gaq.push(['_setAccount', 'UA-32518208-2']);
 _gaq.push(['_trackPageview']);
 (function() {
 var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
 ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
 var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
 })();

 <!-- Adds slight delay to links to allow async reporting -->
 function trackOutboundLink(link, category, action) {
 try {
  _gaq.push(['_trackEvent', category , action]);
 } catch(err){}

 setTimeout(function() {
  document.location.href = link.href;
 }, 100);
 }
 </script>

 <!-- HTML5 shim and Respond.js IE8 support of HTML5 elements and media queries -->
 <!--[if lt IE 9]>
 <script src="https://oss.maxcdn.com/libs/html5shiv/3.7.0/html5shiv.js"></script>
 <script src="https://oss.maxcdn.com/libs/respond.js/1.3.0/respond.min.js"></script>
 <![endif]-->
</head>

<body>

<script src="https://code.jquery.com/jquery.js"></script>
<script src="https://netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js"></script>
<script src="/js/lang-tabs.js"></script>
<script src="/js/downloads.js"></script>

<div class="container" style="max-width: 1200px;">

<div class="masthead">

 <p class="lead">
  <a href="/" rel="external nofollow" >
  <img src="/images/spark-logo.png"
  style="height:100px; width:auto; vertical-align: bottom; margin-top: 20px;"></a><span class="tagline">
   Lightning-fast cluster computing
  </span>
 </p>

</div>

<nav class="navbar navbar-default" role="navigation">
 <!-- Brand and toggle get grouped for better mobile display -->
 <div class="navbar-header">
 <button type="button" class="navbar-toggle" data-toggle="collapse"
   data-target="#navbar-collapse-1">
  <span class="sr-only">Toggle navigation</span>
  <span class="icon-bar"></span>
  <span class="icon-bar"></span>
  <span class="icon-bar"></span>
 </button>
 </div>

 <!-- Collect the nav links, forms, and other content for toggling -->
 <div class="collapse navbar-collapse" id="navbar-collapse-1">
 <ul class="nav navbar-nav">
  <li><a href="/downloads.html" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >Download</a></li>
  <li class="dropdown">
  <a href="#" rel="external nofollow" rel="external nofollow" class="dropdown-toggle" data-toggle="dropdown">
   Libraries <b class="caret"></b>
  </a>
  <ul class="dropdown-menu">
   <li><a href="/sql/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >SQL and DataFrames</a></li>
   <li><a href="/streaming/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >Spark Streaming</a></li>
   <li><a href="/mllib/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >MLlib (machine learning)</a></li>
   <li><a href="/graphx/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >GraphX (graph)</a></li>
   <li class="divider"></li>
   <li><a href="http://spark-packages.org" rel="external nofollow" rel="external nofollow" >Third-Party Packages</a></li>
  </ul>
  </li>
  <li class="dropdown">
  <a href="#" rel="external nofollow" rel="external nofollow" class="dropdown-toggle" data-toggle="dropdown">
   Documentation <b class="caret"></b>
  </a>
  <ul class="dropdown-menu">
   <li><a href="/docs/latest/" rel="external nofollow" >Latest Release (Spark 1.5.1)</a></li>
   <li><a href="/documentation.html" rel="external nofollow" >Other Resources</a></li>
  </ul>
  </li>
  <li><a href="/examples.html" rel="external nofollow" >Examples</a></li>
  <li class="dropdown">
  <a href="/community.html" rel="external nofollow" rel="external nofollow" class="dropdown-toggle" data-toggle="dropdown">
   Community <b class="caret"></b>
  </a>
  <ul class="dropdown-menu">
   <li><a href="/community.html" rel="external nofollow" rel="external nofollow" >Mailing Lists</a></li>
   <li><a href="/community.html#events" rel="external nofollow" >Events and Meetups</a></li>
   <li><a href="/community.html#history" rel="external nofollow" >Project History</a></li>
   <li><a href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark" rel="external nofollow" rel="external nofollow" >Powered By</a></li>
   <li><a href="https://cwiki.apache.org/confluence/display/SPARK/Committers" rel="external nofollow" rel="external nofollow" >Project Committers</a></li>
   <li><a href="https://issues.apache.org/jira/browse/SPARK" rel="external nofollow" rel="external nofollow" >Issue Tracker</a></li>
  </ul>
  </li>
  <li><a href="/faq.html" rel="external nofollow" >FAQ</a></li>
 </ul>
 </div>
 <!-- /.navbar-collapse -->
</nav>

<div class="row">
 <div class="col-md-3 col-md-push-9">
 <div class="news" style="margin-bottom: 20px;">
  <h5>Latest News</h5>
  <ul class="list-unstyled">

   <li><a href="/news/submit-talks-to-spark-summit-east-2016.html" rel="external nofollow" >Submission is open for Spark Summit East 2016</a>
   <span class="small">(Oct 14, 2015)</span></li>

   <li><a href="/news/spark-1-5-1-released.html" rel="external nofollow" >Spark 1.5.1 released</a>
   <span class="small">(Oct 02, 2015)</span></li>

   <li><a href="/news/spark-1-5-0-released.html" rel="external nofollow" >Spark 1.5.0 released</a>
   <span class="small">(Sep 09, 2015)</span></li>

   <li><a href="/news/spark-summit-europe-agenda-posted.html" rel="external nofollow" >Spark Summit Europe agenda posted</a>
   <span class="small">(Sep 07, 2015)</span></li>

  </ul>
  <p class="small" style="text-align: right;"><a href="/news/index.html" rel="external nofollow" >Archive</a></p>
 </div>
 <div class="hidden-xs hidden-sm">
  <a href="/downloads.html" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" class="btn btn-success btn-lg btn-block" style="margin-bottom: 30px;">
  Download Spark
  </a>
  <p style="font-size: 16px; font-weight: 500; color: #555;">
  Built-in Libraries:
  </p>
  <ul class="list-none">
  <li><a href="/sql/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >SQL and DataFrames</a></li>
  <li><a href="/streaming/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >Spark Streaming</a></li>
  <li><a href="/mllib/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >MLlib (machine learning)</a></li>
  <li><a href="/graphx/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >GraphX (graph)</a></li>
  </ul>
  <a href="http://spark-packages.org" rel="external nofollow" rel="external nofollow" >Third-Party Packages</a>
 </div>
 </div>

 <div class="col-md-9 col-md-pull-3">
 <div class="jumbotron">
 <b>Apache Spark™</b> is a fast and general engine for large-scale data processing.
</div>

<div class="row row-padded">
 <div class="col-md-7 col-sm-7">
 <h2>Speed</h2>

 <p class="lead">
  Run programs up to 100x faster than
  Hadoop MapReduce in memory, or 10x faster on disk.
 </p>

 <p>
  Spark has an advanced DAG execution engine that supports cyclic data flow and
  in-memory computing.
 </p>
 </div>
 <div class="col-md-5 col-sm-5 col-padded-top col-center">
 <div style="width: 100%; max-width: 272px; display: inline-block; text-align: center;">
  <img src="/images/logistic-regression.png" style="width: 100%; max-width: 250px;" />
  <div class="caption" style="min-width: 272px;">Logistic regression in Hadoop and Spark</div>
 </div>
 </div>
</div>

<div class="row row-padded">
 <div class="col-md-7 col-sm-7">
 <h2>Ease of Use</h2>

 <p class="lead">
  Write applications quickly in Java, Scala, Python, R.
 </p>

 <p>
  Spark offers over 80 high-level operators that make it easy to build parallel apps.
  And you can use it <em>interactively</em>
  from the Scala, Python and R shells.
 </p>
 </div>
 <div class="col-md-5 col-sm-5 col-padded-top col-center">
 <div style="text-align: left; display: inline-block;">
  <div class="code">
  text_file = spark.textFile(<span class="string">"hdfs://..."</span>)<br />
   <br />
  text_file.<span class="sparkop">flatMap</span>(<span class="closure">lambda line: line.split()</span>)<br />
      .<span class="sparkop">map</span>(<span class="closure">lambda word: (word, 1)</span>)<br />
      .<span class="sparkop">reduceByKey</span>(<span class="closure">lambda a, b: a+b</span>)
  </div>
  <div class="caption">Word count in Spark's Python API</div>
 </div>
 <!--
 <div class="code" style="margin-top: 20px; text-align: left; display: inline-block;">
  text_file = spark.textFile(<span class="string">"hdfs://..."</span>)<br/>
   <br/>
  text_file.<span class="sparkop">filter</span>(<span class="closure">lambda line: "ERROR" in line</span>)<br/>
      .<span class="sparkop">count</span>()
 </div>
 -->
 <!--<div class="caption">Word count in Spark</div>-->
 </div>
</div>

<div class="row row-padded">
 <div class="col-md-7 col-sm-7">
 <h2>Generality</h2>

 <p class="lead">
  Combine SQL, streaming, and complex analytics.
 </p>

 <p>
  Spark powers a stack of libraries including
  <a href="/sql/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >SQL and DataFrames</a>, <a href="/mllib/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >MLlib</a> for machine learning,
  <a href="/graphx/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >GraphX</a>, and <a href="/streaming/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >Spark Streaming</a>.
  You can combine these libraries seamlessly in the same application.
 </p>
 </div>
 <div class="col-md-5 col-sm-5 col-padded-top col-center">
 <img src="/images/spark-stack.png" style="margin-top: 15px; width: 100%; max-width: 296px;" usemap="#stack-map" />
 <map name="stack-map">
  <area shape="rect" coords="0,0,74,95" href="/sql/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" alt="Spark SQL" title="Spark SQL" />
  <area shape="rect" coords="74,0,150,95" href="/streaming/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" alt="Spark Streaming" title="Spark Streaming" />
  <area shape="rect" coords="150,0,224,95" href="/mllib/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" alt="MLlib (machine learning)" title="MLlib" />
  <area shape="rect" coords="225,0,300,95" href="/graphx/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" alt="GraphX" title="GraphX" />
 </map>
 </div>
</div>

<div class="row row-padded" style="margin-bottom: 15px;">
 <div class="col-md-7 col-sm-7">
 <h2>Runs Everywhere</h2>

 <p class="lead">
  Spark runs on Hadoop, Mesos, standalone, or in the cloud. It can access diverse data sources including HDFS, Cassandra, HBase, and S3.
 </p>

 <p>
  You can run Spark using its <a href="/docs/latest/spark-standalone.html" rel="external nofollow" >standalone cluster mode</a>, on <a href="/docs/latest/ec2-scripts.html" rel="external nofollow" >EC2</a>, on Hadoop YARN, or on <a href="http://mesos.apache.org" rel="external nofollow" >Apache Mesos</a>.
  Access data in <a href="http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html" rel="external nofollow" >HDFS</a>, <a href="http://cassandra.apache.org" rel="external nofollow" >Cassandra</a>, <a href="http://hbase.apache.org" rel="external nofollow" >HBase</a>,
  <a href="http://hive.apache.org" rel="external nofollow" >Hive</a>, <a href="http://tachyon-project.org" rel="external nofollow" >Tachyon</a>, and any Hadoop data source.
 </p>
 </div>
 <div class="col-md-5 col-sm-5 col-padded-top col-center">
 <img src="/images/spark-runs-everywhere.png" style="width: 100%; max-width: 280px;" />
 </div>
</div>

 </div>
</div>

<div class="row">
 <div class="col-md-4 col-padded">
 <h3>Community</h3>

 <p>
  Spark is used at a wide range of organizations to process large datasets.
  You can find example use cases at the <a href="http://spark-summit.org/summit-2013/" rel="external nofollow" >Spark Summit</a>
  conference, or on the
  <a href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark" rel="external nofollow" rel="external nofollow" >Powered By</a>
  page.
 </p>

 <p>
  There are many ways to reach the community:
 </p>
 <ul class="list-narrow">
  <li>Use the <a href="/community.html#mailing-lists" rel="external nofollow" >mailing lists</a> to ask questions.</li>
  <li>In-person events include the <a href="http://www.meetup.com/spark-users/" rel="external nofollow" >Bay Area Spark meetup</a> and
  <a href="http://spark-summit.org/" rel="external nofollow" >Spark Summit</a>.</li>
  <li>We use <a href="https://issues.apache.org/jira/browse/SPARK" rel="external nofollow" rel="external nofollow" >JIRA</a> for issue tracking.</li>
 </ul>
 </div>

 <div class="col-md-4 col-padded">
 <h3>Contributors</h3>

 <p>
  Apache Spark is built by a wide set of developers from over 200 companies.
  Since 2009, more than 800 developers have contributed to Spark!
 </p>

 <p>
  The project's
  <a href="https://cwiki.apache.org/confluence/display/SPARK/Committers" rel="external nofollow" rel="external nofollow" >committers</a>
  come from 16 organizations.
 </p>

 <p>
  If you'd like to participate in Spark, or contribute to the libraries on top of it, learn
  <a href="https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark" rel="external nofollow" >how to
  contribute</a>.
 </p>
 </div>

 <div class="col-md-4 col-padded">
 <h3>Getting Started</h3>

 <p>Learning Spark is easy whether you come from a Java or Python background:</p>
 <ul class="list-narrow">
  <li><a href="/downloads.html" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >Download</a> the latest release — you can run Spark locally on your laptop.</li>
  <li>Read the <a href="/docs/latest/quick-start.html" rel="external nofollow" >quick start guide</a>.</li>
  <li>
  Spark Summit 2014 contained free <a href="http://spark-summit.org/2014/training" rel="external nofollow" >training videos and exercises</a>.
  </li>
  <li>Learn how to <a href="/docs/latest/#launching-on-a-cluster" rel="external nofollow" >deploy</a> Spark on a cluster.</li>
 </ul>
 </div>
</div>

<div class="row">
 <div class="col-sm-12 col-center">
 <a href="/downloads.html" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" class="btn btn-success btn-lg" style="width: 262px;">Download Spark</a>
 </div>
</div>

<footer class="small">
 <hr>
 Apache Spark, Spark, Apache, and the Spark logo are trademarks of
 <a href="http://www.apache.org" rel="external nofollow" >The Apache Software Foundation</a>.
</footer>

</div>

</body>
</html>

Process finished with exit code 0

 // Read a network resource
 val webFile=Source.fromURL("http://www.baidu.com/")
 webFile.foreach(print)
 webFile.close()

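Reading from a Chinese-language site fails with the following character-encoding error (fixing it is left to the reader, since it is not the focus of this article):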

Exception in thread "main" java.nio.charset.MalformedInputException: Input length = 1
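One plausible cause of this exception is that the bytes returned by the site are not valid in the charset used for decoding. The sketch below reproduces the same kind of failure with in-memory bytes and shows two possible ways around it. It rests on stated assumptions: the object name CodecSketch and the helper decode are hypothetical, ISO-8859-1 merely stands in for whatever charset the site really uses (for a Chinese site this might be GBK or GB2312), and no network access is performed. The fromURL() signature shown earlier accepts a codec in the same way, for example Source.fromURL(url)(codec).

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.{CodingErrorAction, StandardCharsets}
import scala.io.{Codec, Source}
import scala.util.Try

object CodecSketch {
  // Decodes raw bytes with the supplied codec, much as Source.fromURL
  // decodes the bytes of a response body.
  def decode(bytes: Array[Byte])(implicit codec: Codec): String = {
    val src = Source.fromInputStream(new ByteArrayInputStream(bytes))(codec)
    try src.mkString finally src.close()
  }

  def main(args: Array[String]): Unit = {
    // Bytes that are valid ISO-8859-1 but not valid UTF-8.
    val bytes = "caf\u00e9 au lait".getBytes(StandardCharsets.ISO_8859_1)

    // Strict UTF-8 decoding rejects the stray byte and fails.
    val strict = Codec("UTF-8").onMalformedInput(CodingErrorAction.REPORT)
    println(Try(decode(bytes)(strict)).isFailure)

    // Option 1: decode with the charset the data was actually written in.
    println(decode(bytes)(Codec("ISO-8859-1")))

    // Option 2: keep UTF-8 but substitute U+FFFD for undecodable bytes.
    val lenient = Codec("UTF-8")
      .onMalformedInput(CodingErrorAction.REPLACE)
      .onUnmappableCharacter(CodingErrorAction.REPLACE)
    println(decode(bytes)(lenient))
  }
}
```

Option 1 preserves the text exactly but requires knowing the real charset; Option 2 never throws but silently loses the characters it cannot decode.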

That is all for this article. I hope it helps with your studies, and thank you for your continued support.
