Experimental environment
Cloudera Manager 6.3
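The error
I use Cloudera Manager 6.3 to manage the cluster. When I opened the spark-shell interactive terminal and tried to read data from a MySQL database, the following error appeared: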
scala> val jdbcDF = spark.read.format("jdbc").option("url", "jdbc:mysql://hadoop210:3306/rdd").option("driver", "com.mysql.jdbc.Driver").option("dbtable", "t").option("user", "root").option("password", "000000").load()
java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$.register(DriverRegistry.scala:45)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$5.apply(JDBCOptions.scala:99)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$5.apply(JDBCOptions.scala:99)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:99)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:35)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:32)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:317)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
... 49 elided
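Error analysis
The error message makes the cause clear: the MySQL JDBC driver class com.mysql.jdbc.Driver is not on the classpath. The fix is to download mysql-connector-java-5.1.26-bin.jar and place it on Spark's classpath (in standalone or local mode, the corresponding directory is …/spark/jars).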
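Before copying anything, it can help to confirm that the connector jar really is absent from Spark's jars directory. The following is a minimal sketch, not part of the original procedure: the helper name has_mysql_connector and the SPARK_JARS_DIR variable are hypothetical, and the check only looks for a file named mysql-connector-java*.jar.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given directory contains a
# mysql-connector-java*.jar file, fails otherwise.
has_mysql_connector() {
    jars_dir="$1"
    # When the glob matches nothing it stays literal, so ls exits non-zero.
    ls "$jars_dir"/mysql-connector-java*.jar >/dev/null 2>&1
}

# SPARK_JARS_DIR is an assumed variable; point it at your own Spark jars directory.
if has_mysql_connector "${SPARK_JARS_DIR:-./jars}"; then
    echo "MySQL connector jar found"
else
    echo "MySQL connector jar missing"
fi
```

If the script reports the jar as missing, that matches the ClassNotFoundException shown above.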
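But what if Spark is managed by Cloudera Manager?
The approach is the same; only the directory changes. Locate /opt/cloudera/parcels/CDH-6.3.0-1.cdh6.3.0.p0.1279813/lib/spark/jars and copy mysql-connector-java-5.1.26-bin.jar into it from the command line: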
sudo cp ./mysql-connector-java-5.1.26-bin.jar /opt/cloudera/parcels/CDH-6.3.0-1.cdh6.3.0.p0.1279813/lib/spark/jars/
Then reopen the spark-shell terminal, and the data in MySQL can be read.
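The copy step can also be wrapped in a small script that is safe to run more than once. This is only a sketch under assumptions: the function name install_connector is hypothetical, both paths are passed in as arguments, and the function uses plain cp, so writing into the parcel directory would normally require running it as root (for example through sudo).

```shell
#!/bin/sh
# Hypothetical helper: copy a connector jar into a Spark jars directory
# unless a file with the same name is already there.
install_connector() {
    jar="$1"
    jars_dir="$2"
    if [ ! -f "$jar" ]; then
        echo "jar not found: $jar" >&2
        return 1
    fi
    if [ ! -d "$jars_dir" ]; then
        echo "directory not found: $jars_dir" >&2
        return 1
    fi
    target="$jars_dir/$(basename "$jar")"
    if [ -f "$target" ]; then
        # Nothing to do: a jar with this name is already installed.
        echo "already installed: $target"
        return 0
    fi
    cp "$jar" "$target" && echo "installed: $target"
}
```

For the setup described here, the two arguments would be ./mysql-connector-java-5.1.26-bin.jar and the lib/spark/jars directory of the CDH parcel; spark-shell still has to be restarted afterwards to pick up the new jar.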