RMySQL

RMySQL Database Programming Guide

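The R Geek Ideal series covers the ideas, usage, tools, and innovations around R, and explains R's strengths through my own learning and experience.

R began as a language for statistics and long shone in a niche field. With the explosion of big data, it became a sought-after tool for data analysis. As more people with engineering backgrounds join, the R community is growing rapidly. R is now used well beyond statistics: in education, banking, e-commerce, internet companies, and more.

To become geeks with ideals, we cannot stop at syntax. We need a solid grounding in mathematics, probability, and statistics, plus a spirit of innovation, to bring R to every field. Let's get moving and start pursuing the R geek ideal.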

Please credit the source when reposting:
http://blog.fens.me/r-mysql-rmysql/

Preface

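MySQL is one of the most widely used open-source databases. It is easy to install, runs stably, and is well suited to small and medium-sized data storage. As a data analysis tool, R naturally needs database driver interfaces, and combining R with MySQL makes a powerful pairing.

Windows and Linux differ in their character sets and runtime environments. So today we cover how to install and use RMySQL on both Linux and Windows.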

Contents

  1. Introduction to RMySQL
  2. Installing RMySQL on Linux
  3. Installing RMySQL on Windows 7
  4. Using RMySQL functions
  5. RMySQL case study

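1. Introduction to RMySQL

RMySQL is an R package that provides an R interface to MySQL databases. It depends on the DBI package. Beyond basic database access and SQL queries, RMySQL wraps some convenience methods, such as reading a whole table, paging through results, and bulk-inserting a data.frame. Once you are comfortable with RMySQL, database programming in R becomes straightforward.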

2. Installing RMySQL on Linux

Linux system environment:

  • Linux: Ubuntu 12.04.2 LTS 64bit server
  • Linux character set: en_US.UTF-8
  • R: 3.0.1, x86_64-pc-linux-gnu (64-bit)
  • MySQL: Ver 14.14 Distrib 5.5.29 64bit server
  • MySQL character set: utf8

~ uname -a
Linux conan 3.5.0-23-generic #35~precise1-Ubuntu SMP Fri Jan 25 17:13:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

~ cat /etc/issue
Ubuntu 12.04.2 LTS \n \l

~ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8

~ R --version
R version 3.0.1 (2013-05-16) -- "Good Sport"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
http://www.gnu.org/licenses/.

~ mysql --version
mysql  Ver 14.14 Distrib 5.5.29, for debian-linux-gnu (x86_64) using readline 6.2

mysql> show variables like '%char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

Install RMySQL from within R


~ R
> install.packages('RMySQL')

also installing the dependency ‘DBI’

trying URL 'http://cran.dataguru.cn/src/contrib/DBI_0.2-7.tar.gz'
Content type 'application/x-gzip' length 194699 bytes (190 Kb)
opened URL
==================================================
downloaded 190 Kb

trying URL 'http://cran.dataguru.cn/src/contrib/RMySQL_0.9-3.tar.gz'
Content type 'application/x-gzip' length 165363 bytes (161 Kb)
opened URL
==================================================
downloaded 161 Kb

...

Configuration error:
  could not find the MySQL installation include and/or library
  directories.  Manually specify the location of the MySQL
  libraries and the header files and re-run R CMD INSTALL.

INSTRUCTIONS:

1. Define and export the 2 shell variables PKG_CPPFLAGS and
   PKG_LIBS to include the directory for header files (*.h)
   and libraries, for example (using Bourne shell syntax):

      export PKG_CPPFLAGS="-I"
      export PKG_LIBS="-L -lmysqlclient"

   Re-run the R INSTALL command:

      R CMD INSTALL RMySQL_.tar.gz

2. Alternatively, you may pass the configure arguments
      --with-mysql-dir= (distribution directory)
   or
      --with-mysql-inc= (where MySQL header files reside)
      --with-mysql-lib= (where MySQL libraries reside)
   in the call to R INSTALL --configure-args='...'

   R CMD INSTALL --configure-args='--with-mysql-dir=DIR' RMySQL_.tar.gz

ERROR: configuration failed for package ‘RMySQL’
* removing ‘/home/conan/R/x86_64-pc-linux-gnu-library/3.0/RMySQL’

The downloaded source packages are in
        ‘/tmp/Rtmpu0Gn88/downloaded_packages’
Warning message:
In install.packages("RMySQL") :
  installation of package ‘RMySQL’ had non-zero exit status

The installation failed. The message says we need to pass configure arguments that point at the MySQL installation directories.


# Install the MySQL client libraries
~ sudo apt-get install libdbd-mysql libmysqlclient-dev

# Find the MySQL installation directory
~ whereis mysql
mysql: /usr/bin/mysql /etc/mysql /usr/lib/mysql /usr/bin/X11/mysql /usr/share/mysql /usr/share/man/man1/mysql.1.gz

# Locate the RMySQL source package that was just downloaded
~ ls /tmp/Rtmpu0Gn88/downloaded_packages
DBI_0.2-7.tar.gz  RMySQL_0.9-3.tar.gz

# Install RMySQL from the command line
~ R CMD INSTALL --configure-args='--with-mysql-dir=/usr/lib/mysql' /tmp/Rtmpu0Gn88/downloaded_packages/RMySQL_0.9-3.tar.gz

* installing to library ‘/home/conan/R/x86_64-pc-linux-gnu-library/3.0’
* installing *source* package ‘RMySQL’ ...
** package ‘RMySQL’ successfully unpacked and MD5 sums checked
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ANSI C... none needed
checking how to run the C preprocessor... gcc -E
checking for compress in -lz... yes
checking for getopt_long in -lc... yes
checking for mysql_init in -lmysqlclient... yes
checking for egrep... grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking mysql.h usability... no
checking mysql.h presence... no
checking for mysql.h... no
checking /usr/local/include/mysql/mysql.h usability... no
checking /usr/local/include/mysql/mysql.h presence... no
checking for /usr/local/include/mysql/mysql.h... no
checking /usr/include/mysql/mysql.h usability... yes
checking /usr/include/mysql/mysql.h presence... yes
checking for /usr/include/mysql/mysql.h... yes
configure: creating ./config.status
config.status: creating src/Makevars
** libs
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -I/usr/include/mysql     -fpic  -O3 -pipe  -g  -c RS-DBI.c -o RS-DBI.o
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -I/usr/include/mysql     -fpic  -O3 -pipe  -g  -c RS-MySQL.c -o RS-MySQL.o
gcc -std=gnu99 -shared -o RMySQL.so RS-DBI.o RS-MySQL.o -lmysqlclient -lz -L/usr/lib/R/lib -lR
installing to /home/conan/R/x86_64-pc-linux-gnu-library/3.0/RMySQL/libs
** R
** inst
** preparing package for lazy loading
Creating a generic function for ‘format’ from package ‘base’ in package ‘RMySQL’
Creating a generic function for ‘print’ from package ‘base’ in package ‘RMySQL’
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
* DONE (RMySQL)

RMySQL installed successfully.

Create the database and table in MySQL


~ mysql -uroot -p

mysql> create database rmysql;
Query OK, 1 row affected (0.00 sec)

mysql> grant all on rmysql.* to rmysql@'%' identified by 'rmysql';
Query OK, 0 rows affected (0.00 sec)

mysql> grant all on rmysql.* to rmysql@localhost identified by 'rmysql';
Query OK, 0 rows affected (0.00 sec)

mysql> use rmysql
Database changed

mysql> CREATE TABLE t_user(
    -> id INT PRIMARY KEY AUTO_INCREMENT,
    -> user varchar(12) NOT NULL UNIQUE
    -> )ENGINE=INNODB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.07 sec)

mysql> INSERT INTO t_user(user) values('A1'),('AB'),('fens.me');
Query OK, 3 rows affected (0.04 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM t_user;
+----+---------+
| id | user    |
+----+---------+
|  1 | A1      |
|  2 | AB      |
|  3 | fens.me |
+----+---------+
3 rows in set (0.00 sec)

Read data from the MySQL database with an R program


~ R

> library(RMySQL)
Loading required package: DBI

> conn <- dbConnect(MySQL(), dbname = "rmysql", username="rmysql", password="rmysql")
> users = dbGetQuery(conn, "SELECT * FROM t_user")
> dbDisconnect(conn)
[1] TRUE
> users
  id    user
1  1      A1
2  2      AB
3  3 fens.me

That's it: R is now connected to MySQL on Linux.
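
The connect, query, disconnect sequence above can be wrapped so that the connection is closed even when the query fails. This is a minimal sketch, assuming the rmysql database and account created above; run_query is a hypothetical helper name, not part of RMySQL or DBI.

```r
# run_query is a hypothetical helper, not part of RMySQL or DBI.
# Assumes the "rmysql" database and account created earlier in this guide.
run_query <- function(sql, dbname = "rmysql", username = "rmysql",
                      password = "rmysql", host = "localhost") {
  conn <- DBI::dbConnect(RMySQL::MySQL(), dbname = dbname,
                         username = username, password = password,
                         host = host)
  # on.exit runs when the function exits, normally or through an error,
  # so the connection is not left open.
  on.exit(DBI::dbDisconnect(conn), add = TRUE)
  DBI::dbGetQuery(conn, sql)
}

# Example call (needs a running MySQL server):
# users <- run_query("SELECT * FROM t_user")
```

Registering the disconnect with on.exit keeps the cleanup next to the connect call, which is easy to forget when typing the three steps by hand.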

3. Installing RMySQL on Windows 7

Windows system environment:

  • Windows 7: 64-bit Ultimate
  • Windows character sets: gbk, utf8
  • R: 3.0.1, x86_64-w64-mingw32/x64 (64-bit)
  • MySQL: mysql Ver 14.14 Distrib 5.6.11, for Win64 (x86_64)

~ R --version
R version 3.0.1 (2013-05-16) -- "Good Sport"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
http://www.gnu.org/licenses/.

~ mysql --version
mysql  Ver 14.14 Distrib 5.6.11, for Win64 (x86_64)

mysql> show variables like '%char%';
+--------------------------+------------------------------------+
| Variable_name            | Value                              |
+--------------------------+------------------------------------+
| character_set_client     | gbk                                |
| character_set_connection | gbk                                |
| character_set_database   | utf8                               |
| character_set_filesystem | binary                             |
| character_set_results    | gbk                                |
| character_set_server     | utf8                               |
| character_set_system     | utf8                               |
| character_sets_dir       | D:\toolkit\mysql56\share\charsets\ |
+--------------------------+------------------------------------+
8 rows in set (0.07 sec)

Install RMySQL from within R


~ D:\workspace\R\mysql>R

> install.packages('RMySQl')
package 'RMySQl' is not available (for R version 3.0.1)

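The message says no RMySQL package is available. Note that the command above misspells the name as 'RMySQl'. Package names are case-sensitive, so that spelling cannot be found. Next we install from source using the correct name.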


# Download the RMySQL source package
> install.packages("RMySQL", type="source")
trying URL 'http://cran.dataguru.cn/src/contrib/RMySQL_0.9-3.tar.gz'
Content type 'application/x-gzip' length 165363 bytes (161 Kb)
opened URL
downloaded 161 Kb

* installing *source* package 'RMySQL' ...
** package 'RMySQL' successfully unpacked and MD5 sums checked
checking for $MYSQL_HOME... not found... searching registry...

cygwin warning:
  MS-DOS style path detected: C:/PROGRA~1/R/R-30~1.1/bin/x64/Rscript
  Preferred POSIX equivalent is: /cygdrive/c/PROGRA~1/R/R-30~1.1/bin/x64/Rscript
  CYGWIN environment variable option "nodosfilewarning" turns off this warning.
  Consult the user's guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
readRegistry("SOFTWARE\\MySQL AB", hive = "HLM", maxdepth = 2) :
  Registry key 'SOFTWARE\MySQL AB' not found

ERROR: configuration failed for package 'RMySQL'
* removing 'C:/Program Files/R/R-3.0.1/library/RMySQL'
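The downloaded source packages are in
        'C:\Users\Administrator\AppData\Local\Temp\RtmpsfqQjK\downloaded_packages'
Warning message:
In install.packages("RMySQL", type = "source") :
  installation of package 'RMySQL' had non-zero exit status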

Locate the source package: RMySQL_0.9-3.tar.gz


~ dir C:\Users\Administrator\AppData\Local\Temp\RtmpsfqQjK\downloaded_packages
2013-09-24  13:16           165,363 RMySQL_0.9-3.tar.gz

Install from the source package


~ D:\workspace\R\mysql>R CMD INSTALL C:\Users\Administrator\AppData\Local\Temp\RtmpsfqQjK\downloaded_packages\RMySQL_0.9-3.tar.gz
* installing to library 'C:/Program Files/R/R-3.0.1/library'
* installing *source* package 'RMySQL' ...
** package 'RMySQL' successfully unpacked and MD5 sums checked
checking for $MYSQL_HOME... not found... searching registry...

cygwin warning:
  MS-DOS style path detected: C:/PROGRA~1/R/R-30~1.1/bin/x64/Rscript
  Preferred POSIX equivalent is: /cygdrive/c/PROGRA~1/R/R-30~1.1/bin/x64/Rscript
  CYGWIN environment variable option "nodosfilewarning" turns off this warning.
  Consult the user's guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
readRegistry("SOFTWARE\\MySQL AB", hive = "HLM", maxdepth = 2) :
  Registry key 'SOFTWARE\MySQL AB' not found

ERROR: configuration failed for package 'RMySQL'
* removing 'C:/Program Files/R/R-3.0.1/library/RMySQL'

Set the MYSQL_HOME environment variable


set MYSQL_HOME=D:\toolkit\mysql56

Note: it is best to set MYSQL_HOME as a system environment variable.

Install RMySQL again


D:\workspace\R\mysql>R CMD INSTALL C:\Users\Administrator\AppData\Local\Temp\RtmpsfqQjK\downloaded_packages\RMySQL_0.9-3.tar.gz
* installing to library 'C:/Program Files/R/R-3.0.1/library'
* installing *source* package 'RMySQL' ...
** package 'RMySQL' successfully unpacked and MD5 sums checked
checking for $MYSQL_HOME... D:\toolkit\mysql56
cygwin warning:
  MS-DOS style path detected: D:\toolkit\mysql56
  Preferred POSIX equivalent is: /cygdrive/d/toolkit/mysql56
  CYGWIN environment variable option "nodosfilewarning" turns off this warning.
  Consult the user's guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
** libs
Warning: this package has a non-empty 'configure.win' file,
so building only the main architecture

cygwin warning:
  MS-DOS style path detected: C:/PROGRA~1/R/R-30~1.1/etc/x64/Makeconf
  Preferred POSIX equivalent is: /cygdrive/c/PROGRA~1/R/R-30~1.1/etc/x64/Makeconf
  CYGWIN environment variable option "nodosfilewarning" turns off this warning.
  Consult the user's guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
gcc -m64 -I"C:/PROGRA~1/R/R-30~1.1/include" -DNDEBUG -I"D:\toolkit\mysql56"/include    -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -std=gnu99 -mtune=core2 -c RS-DBI.c -o RS-DBI.o
RS-DBI.c: In function 'RS_na_set':
RS-DBI.c:1219:11: warning: variable 'c' set but not used [-Wunused-but-set-variable]
gcc -m64 -I"C:/PROGRA~1/R/R-30~1.1/include" -DNDEBUG -I"D:\toolkit\mysql56"/include    -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -std=gnu99 -mtune=core2 -c RS-MySQL.c -o RS-MySQL.o
RS-MySQL.c: In function 'RS_MySQL_fetch':
RS-MySQL.c:657:13: warning: variable 'fld_nullOk' set but not used [-Wunused-but-set-variable]
RS-MySQL.c: In function 'RS_DBI_invokeBeginGroup':
RS-MySQL.c:1137:30: warning: variable 'val' set but not used [-Wunused-but-set-variable]
RS-MySQL.c: In function 'RS_DBI_invokeNewRecord':
RS-MySQL.c:1158:20: warning: variable 'val' set but not used [-Wunused-but-set-variable]
RS-MySQL.c: In function 'RS_MySQL_dbApply':
RS-MySQL.c:1219:38: warning: variable 'fld_nullOk' set but not used [-Wunused-but-set-variable]
gcc -m64 -shared -s -static-libgcc -o RMySQL.dll tmp.def RS-DBI.o RS-MySQL.o D:\toolkit\mysql56/bin/libmySQL.dll -Ld:/RCompile/CRANpkg/extralibs64/local/lib/x64 -Ld:/RCompile/CRANpkg/extralibs64/local/lib -LC:/PROGRA~1/R/R-30~1.1/bin/x64 -lR
gcc.exe: error: D:\toolkit\mysql56/bin/libmySQL.dll: No such file or directory
ERROR: compilation failed for package 'RMySQL'
* removing 'C:/Program Files/R/R-3.0.1/library/RMySQL'

The error is that the dynamic link library was not found: D:\toolkit\mysql56/bin/libmySQL.dll


# Copy the dynamic link library into bin\ as libmySQL.dll
cp D:\toolkit\mysql56\lib\libmysql.dll D:\toolkit\mysql56\bin\
mv D:\toolkit\mysql56\bin\libmysql.dll D:\toolkit\mysql56\bin\libmySQL.dll
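
Before rebuilding, the two things the failed builds complained about (MYSQL_HOME and the DLL under bin) can be checked from R. This is a sketch only; check_mysql_home is a hypothetical helper, and the bin/libmySQL.dll path is taken from the gcc error in the log above.

```r
# check_mysql_home is a hypothetical helper for illustration only.
# It reports whether MYSQL_HOME is set, whether that directory exists,
# and whether bin/libmySQL.dll (the file gcc could not find) is present.
check_mysql_home <- function(home = Sys.getenv("MYSQL_HOME")) {
  dll <- file.path(home, "bin", "libmySQL.dll")
  list(
    home_set   = nzchar(home),
    home_found = nzchar(home) && file.exists(home),
    dll_found  = nzchar(home) && file.exists(dll)
  )
}

str(check_mysql_home())
```

If any of the three entries is FALSE, the source build is likely to stop at the same point as in the logs above.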

Install RMySQL again


~ D:\workspace\R\mysql>R CMD INSTALL C:\Users\Administrator\AppData\Local\Temp\RtmpsfqQjK\downloaded_packages\RMySQL_0.9-3.tar.gz
* installing to library 'C:/Program Files/R/R-3.0.1/library'
* installing *source* package 'RMySQL' ...
** package 'RMySQL' successfully unpacked and MD5 sums checked
checking for $MYSQL_HOME... D:\toolkit\mysql56
cygwin warning:
  MS-DOS style path detected: D:\toolkit\mysql56
  Preferred POSIX equivalent is: /cygdrive/d/toolkit/mysql56
  CYGWIN environment variable option "nodosfilewarning" turns off this warning.
  Consult the user's guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
** libs
Warning: this package has a non-empty 'configure.win' file,
so building only the main architecture

cygwin warning:
  MS-DOS style path detected: C:/PROGRA~1/R/R-30~1.1/etc/x64/Makeconf
  Preferred POSIX equivalent is: /cygdrive/c/PROGRA~1/R/R-30~1.1/etc/x64/Makeconf
  CYGWIN environment variable option "nodosfilewarning" turns off this warning.
  Consult the user's guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
gcc -m64 -I"C:/PROGRA~1/R/R-30~1.1/include" -DNDEBUG -I"D:\toolkit\mysql56"/include    -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -std=gnu99 -mtune=core2 -c RS-DBI.c -o RS-DBI.o
RS-DBI.c: In function 'RS_na_set':
RS-DBI.c:1219:11: warning: variable 'c' set but not used [-Wunused-but-set-variable]
gcc -m64 -I"C:/PROGRA~1/R/R-30~1.1/include" -DNDEBUG -I"D:\toolkit\mysql56"/include    -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -std=gnu99 -mtune=core2 -c RS-MySQL.c -o RS-MySQL.o
RS-MySQL.c: In function 'RS_MySQL_fetch':
RS-MySQL.c:657:13: warning: variable 'fld_nullOk' set but not used [-Wunused-but-set-variable]
RS-MySQL.c: In function 'RS_DBI_invokeBeginGroup':
RS-MySQL.c:1137:30: warning: variable 'val' set but not used [-Wunused-but-set-variable]
RS-MySQL.c: In function 'RS_DBI_invokeNewRecord':
RS-MySQL.c:1158:20: warning: variable 'val' set but not used [-Wunused-but-set-variable]
RS-MySQL.c: In function 'RS_MySQL_dbApply':
RS-MySQL.c:1219:38: warning: variable 'fld_nullOk' set but not used [-Wunused-but-set-variable]
gcc -m64 -shared -s -static-libgcc -o RMySQL.dll tmp.def RS-DBI.o RS-MySQL.o D:\toolkit\mysql56/bin/libmySQL.dll -Ld:/RCompile/CRANpkg/extralibs64/local/lib/x64 -Ld:/RCompile/CRANpkg/extralibs64/local/lib -LC:/PROGRA~1/R/R-30~1.1/bin/x64 -lR
installing to C:/Program Files/R/R-3.0.1/library/RMySQL/libs/x64
** R
** inst
** preparing package for lazy loading
Creating a generic function for 'format' from package 'base' in package 'RMySQL'
Creating a generic function for 'print' from package 'base' in package 'RMySQL'
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
MYSQL_HOME defined as D:\toolkit\mysql56
* DONE (RMySQL)

Installation succeeded!

Create the database and table in MySQL


~ mysql -uroot -p

mysql> create database rmysql;
Query OK, 1 row affected (0.04 sec)

mysql> grant all on rmysql.* to rmysql@'%' identified by 'rmysql';
Query OK, 0 rows affected (0.00 sec)

mysql> grant all on rmysql.* to rmysql@localhost identified by 'rmysql';
Query OK, 0 rows affected (0.00 sec)

mysql> use rmysql
Database changed
mysql> CREATE TABLE t_user(
    -> id INT PRIMARY KEY AUTO_INCREMENT,
    -> user varchar(12) NOT NULL UNIQUE
    -> )ENGINE=INNODB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (1.01 sec)

mysql>
mysql> INSERT INTO t_user(user) values('A1'),('AB'),('fens.me');
Query OK, 3 rows affected (0.05 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM t_user;
+----+---------+
| id | user    |
+----+---------+
|  1 | A1      |
|  2 | AB      |
|  3 | fens.me |
+----+---------+
3 rows in set (0.03 sec)

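Read data from the MySQL database with an R program.

Note: if you did not add MYSQL_HOME to the system environment variables earlier, you must set it each time before starting R.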


~ set MYSQL_HOME=D:\toolkit\mysql56
~ R

> library(RMySQL)
Loading required package: DBI
MYSQL_HOME defined as D:\toolkit\mysql56

> conn <- dbConnect(MySQL(), dbname = "rmysql", username="root", password="",client.flag=CLIENT_MULTI_STATEMENTS)
> users = dbGetQuery(conn, "SELECT * FROM t_user")
> dbDisconnect(conn)
[1] TRUE
> users
  id    user
1  1      A1
2  2      AB
3  3 fens.me

That's it: R is now connected to MySQL on Windows 7.

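4. Using RMySQL functions

With the environment set up, let's look at how to use the RMySQL package.

  • RMySQL helper operations
  • RMySQL database operations
  • Character set settings for Windows

1). RMySQL helper operations

Load the library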

> library(RMySQL)

Open a local connection


> conn <- dbConnect(MySQL(), dbname = "rmysql", username="rmysql", password="rmysql",client.flag=CLIENT_MULTI_STATEMENTS)

Open a remote connection


> conn <- dbConnect(MySQL(), dbname = "rmysql", username="rmysql", password="rmysql",host="192.168.1.201",port=3306)

Close the connection

> dbDisconnect(conn)

List the tables in the database


> dbListTables(conn)
[1] "t_user"

List the fields of a table


> dbListFields(conn, "t_user")
[1] "id"   "user"

Query MySQL driver information


> summary(MySQL(), verbose = TRUE)
<MySQLDriver:(23864)> 
  Driver name:  MySQL 
  Max  connections: 16 
  Conn. processed: 3 
  Default records per fetch: 500 
  DBI API version:  

# Information about this MySQL connection
> summary(conn, verbose = TRUE)
<MySQLConnection:(23864,2)> 
  User: root 
  Host: localhost 
  Dbname: rmysql 
  Connection type: localhost via TCP/IP 
  MySQL server version:  5.6.11 
  MySQL client version:  5.6.11 
  MySQL protocol version:  10 
  MySQL server thread id:  35 
  No resultSet available

# List the open MySQL connections
> dbListConnections(MySQL())
[[1]]
<MySQLConnection:(23864,2)> 

2). RMySQL database operations


# Create a table and insert data
> t_demo<-data.frame(
  a=1:10,
  b=letters[1:10],
  c=rnorm(10)
)
> dbWriteTable(conn, "t_demo", t_demo)

# Read the whole table
> dbReadTable(conn, "t_demo")
    a b           c
1   1 a  0.98868164
2   2 b -0.66935770
3   3 c  0.27703638
4   4 d  1.36137156
5   5 e -0.70291017
6   6 f  1.61235088
7   7 g  0.17616068
8   8 h  0.29700017
9   9 i  0.19032719
10 10 j -0.06222173

# Append new rows
> dbWriteTable(conn, "t_demo", t_demo, append=TRUE)
> dbReadTable(conn, "t_demo")
   row_names  a b           c
1          1  1 a  0.98868164
2          2  2 b -0.66935770
3          3  3 c  0.27703638
4          4  4 d  1.36137156
5          5  5 e -0.70291017
6          6  6 f  1.61235088
7          7  7 g  0.17616068
8          8  8 h  0.29700017
9          9  9 i  0.19032719
10        10 10 j -0.06222173
11         1  1 a  0.98868164
12         2  2 b -0.66935770
13         3  3 c  0.27703638
14         4  4 d  1.36137156
15         5  5 e -0.70291017
16         6  6 f  1.61235088
17         7  7 g  0.17616068
18         8  8 h  0.29700017
19         9  9 i  0.19032719
20        10 10 j -0.06222173

# Overwrite the existing table
> dbWriteTable(conn, "t_demo", t_demo, overwrite=TRUE)

# 1). Query data
> d0 <- dbGetQuery(conn, "SELECT * FROM t_demo where c>0")
> class(d0)
[1] "data.frame"

> d0
  row_names a b         c
1         1 1 a 0.9886816
2         3 3 c 0.2770364
3         4 4 d 1.3613716
4         6 6 f 1.6123509
5         7 7 g 0.1761607
6         8 8 h 0.2970002
7         9 9 i 0.1903272

# 2). Run a SQL query and fetch the result in pages
> rs <- dbSendQuery(conn, "SELECT * FROM t_demo where c>0")
> class(rs)
[1] "MySQLResult"
attr(,"package")
[1] "RMySQL"

# Fetch the first page of 3 rows, then close the result set
> d1 <- fetch(rs, n = 3)
> d1
  row_names a b         c
1         1 1 a 0.9886816
2         3 3 c 0.2770364
3         4 4 d 1.3613716
> mysqlCloseResult(rs)
[1] TRUE

# 3). Summary statistics of the query result d0
> summary(d0)
  row_names               a              b                   c         
 Length:7           Min.   :1.000   Length:7           Min.   :0.1762  
 Class :character   1st Qu.:3.500   Class :character   1st Qu.:0.2337  
 Mode  :character   Median :6.000   Mode  :character   Median :0.2970  
                    Mean   :5.429                      Mean   :0.7004  
                    3rd Qu.:7.500                      3rd Qu.:1.1750  
                    Max.   :9.000                      Max.   :1.6124

# Do not write the row.names column
> dbWriteTable(conn, "t_demo", t_demo,row.names=FALSE,overwrite=TRUE)
> dbGetQuery(conn, "SELECT * FROM t_demo where c>0")
  a b         c
1 1 a 0.9886816
2 3 c 0.2770364
3 4 d 1.3613716
4 6 f 1.6123509
5 7 g 0.1761607
6 8 h 0.2970002
7 9 i 0.1903272

# Drop the table
> if(dbExistsTable(conn,'t_demo')){
+     dbRemoveTable(conn, "t_demo")
+ }
[1] TRUE

Run a SQL statement with dbSendQuery


> query<-dbSendQuery(conn, "show tables")
> data <- fetch(query, n = -1)
> data
  Tables_in_rmysql
1           t_demo
2           t_user
> mysqlCloseResult(query)
[1] TRUE
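
The paging pattern with dbSendQuery and fetch shown earlier can be extended into a loop that reads a result set page by page. This is a sketch under assumptions: fetch_pages is a hypothetical helper name, and it uses the DBI generics dbHasCompleted and dbClearResult where the session above calls mysqlCloseResult.

```r
# fetch_pages is a hypothetical helper, not part of RMySQL.
# It reads a result set page_size rows at a time and hands each page
# to the callback `handle`.
fetch_pages <- function(conn, sql, page_size = 3, handle = print) {
  rs <- DBI::dbSendQuery(conn, sql)
  # Release the result set even if `handle` raises an error.
  on.exit(DBI::dbClearResult(rs), add = TRUE)
  pages <- 0
  while (!DBI::dbHasCompleted(rs)) {
    page <- DBI::fetch(rs, n = page_size)
    if (nrow(page) == 0) break   # nothing left to read
    handle(page)
    pages <- pages + 1
  }
  invisible(pages)
}

# Example call (needs an open connection `conn`):
# fetch_pages(conn, "SELECT * FROM t_demo where c>0", page_size = 3)
```

The result set must stay open while pages are being fetched, so it is cleared only after the loop ends.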

3). Character set settings for Windows
On Windows 7, insert Chinese text into MySQL


mysql> INSERT INTO t_user(user) values('小朋友'),('你好'),('正確了');
Query OK, 3 rows affected (0.07 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from t_user;
+----+---------+
| id | user    |
+----+---------+
|  1 | A1      |
|  2 | AB      |
|  3 | fens.me |
|  5 | 你好    |
|  4 | 小朋友  |
|  6 | 正確了  |
+----+---------+
6 rows in set (0.07 sec)

Query through RMySQL


> dbGetQuery(conn, "SELECT * FROM t_user")
  id    user
1  1      A1
2  2      AB
3  3 fens.me
4  5      ??
5  4     ???
6  6     ???

Set the GBK character set


> dbDisconnect(conn)
> conn <- dbConnect(MySQL(), dbname = "rmysql", username="root", password="",client.flag=CLIENT_MULTI_STATEMENTS)
> dbSendQuery(conn,'SET NAMES gbk')
 
> query<-dbSendQuery(conn, "SELECT * FROM t_user")
> data <- fetch(query, n = -1)
> mysqlCloseResult(query)
[1] TRUE
> data
  id    user
1  1      A1
2  2      AB
3  3 fens.me
4  5    你好
5  4  小朋友
6  6  正確了

OK, we have fixed the character encoding problem on Windows.
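
The SET NAMES step can be wrapped in a small function so it is applied the same way after every connect. This is a sketch, not the author's code: both function names are hypothetical, and the allow-list of character sets holds only the two that appear in this guide.

```r
# Both helpers are hypothetical names used for illustration.
# set_names_sql builds the statement; the character set name is checked
# against a short allow-list because it is pasted into the SQL text.
set_names_sql <- function(charset = c("gbk", "utf8")) {
  charset <- match.arg(charset)
  paste("SET NAMES", charset)
}

# Apply the setting on an open connection and release the empty result.
set_connection_charset <- function(conn, charset = "gbk") {
  rs <- DBI::dbSendQuery(conn, set_names_sql(charset))
  DBI::dbClearResult(rs)
  invisible(conn)
}

set_names_sql("gbk")
```

Which character set to request depends on the encoding of the R console. gbk matches the Windows setup described in this section, while the Linux setup earlier in the guide used utf8 throughout.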

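5. RMySQL case study

Requirements: MySQL on Linux, R on Windows 7, connected remotely

  • 1. Create table t_blog with SQL, with a primary key index and a unique key index
  • 2. Insert data with RMySQL, including Chinese fields
  • 3. Read the data back with RMySQL

1). Create table t_blog with SQL, with a primary key index and a unique key index
Table creation statement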


CREATE TABLE t_blog(
id INT PRIMARY KEY AUTO_INCREMENT,
title varchar(12) NOT NULL UNIQUE,
author varchar(12) NOT NULL, 
length int NOT NULL,
create_date timestamp NOT NULL DEFAULT now()
)ENGINE=INNODB DEFAULT CHARSET=UTF8;

mysql> desc t_blog;
+-------------+-------------+------+-----+-------------------+----------------+
| Field       | Type        | Null | Key | Default           | Extra          |
+-------------+-------------+------+-----+-------------------+----------------+
| id          | int(11)     | NO   | PRI | NULL              | auto_increment |
| title       | varchar(12) | NO   | UNI | NULL              |                |
| author      | varchar(12) | NO   |     | NULL              |                |
| length      | int(11)     | NO   |     | NULL              |                |
| create_date | timestamp   | NO   |     | CURRENT_TIMESTAMP |                |
+-------------+-------------+------+-----+-------------------+----------------+
5 rows in set (0.00 sec)

mysql> show indexes from t_blog;
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table  | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| t_blog |          0 | PRIMARY  |            1 | id          | A         |           3 |     NULL | NULL   |      | BTREE      |         |               |
| t_blog |          0 | title    |            1 | title       | A         |           3 |     NULL | NULL   |      | BTREE      |         |               |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.00 sec)


INSERT INTO t_blog(title,author,length) values('你好,第一篇','Conan',20),('RMySQL數據庫編程','Conan',99),('R的極客理想系列文章','Conan',15);

mysql> select * from t_blog;
+----+------------------------------+--------+--------+---------------------+
| id | title                        | author | length | create_date         |
+----+------------------------------+--------+--------+---------------------+
|  1 | 你好,第一篇                 | Conan  |     20 | 2013-08-15 00:13:13 |
|  2 | RMySQL數據庫編程             | Conan  |     99 | 2013-08-15 00:13:13 |
|  3 | R的極客理想系列文章          | Conan  |     15 | 2013-08-15 00:13:13 |
+----+------------------------------+--------+--------+---------------------+
3 rows in set (0.00 sec)

2). Insert data with RMySQL, including Chinese fields, then read it back


> library(RMySQL)
> conn <- dbConnect(MySQL(), dbname = "rmysql", username="rmysql", password="rmysql",host="192.168.1.201",port=3306)
> 
> dbSendQuery(conn,'SET NAMES gbk')
 
> dbSendQuery(conn,"INSERT INTO t_blog(title,author,length) values('R插入的新文章','Conan',50)");
 
> 
> query<-dbSendQuery(conn, "SELECT * FROM t_blog")
Warning message:
In mysqlExecStatement(conn, statement, ...) :
  RS-DBI driver warning: (unrecognized MySQL field type 7 in column 4 imported as character)
> data <- fetch(query, n = -1)
> mysqlCloseResult(query)
[1] TRUE
> print(data)
  id               title author length         create_date
1  1        你好,第一篇  Conan     20 2013-08-15 00:13:13
2  2    RMySQL數據庫編程  Conan     99 2013-08-15 00:13:13
3  3 R的極客理想系列文章  Conan     15 2013-08-15 00:13:13
4  4       R插入的新文章  Conan     50 2013-08-15 00:29:45
> 
> dbDisconnect(conn)
[1] TRUE

Important: do not use the dbWriteTable function for this!
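
Since rows go in through plain SQL here, values that come from variables should be escaped before being pasted into the statement. This is a hedged sketch: insert_blog is a hypothetical helper, and it assumes RMySQL's dbEscapeStrings is available in the installed version.

```r
# insert_blog is a hypothetical helper, not part of RMySQL.
# It escapes the text values with RMySQL's dbEscapeStrings before
# building the INSERT statement for the t_blog table defined above.
insert_blog <- function(conn, title, author, length) {
  sql <- sprintf(
    "INSERT INTO t_blog(title,author,length) values('%s','%s',%d)",
    RMySQL::dbEscapeStrings(conn, title),
    RMySQL::dbEscapeStrings(conn, author),
    as.integer(length)
  )
  rs <- DBI::dbSendQuery(conn, sql)
  DBI::dbClearResult(rs)
  invisible(sql)
}

# Example call (needs an open connection `conn`):
# insert_blog(conn, "A new post", "Conan", 50)
```

The id and create_date columns are left out of the statement so that MySQL fills them from AUTO_INCREMENT and the timestamp default.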

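That completes the guide. We have covered the main techniques for using RMySQL. I hope that understanding how it works helps you make fewer mistakes and work more efficiently!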
