Python Server Operations Notes: Chapter 1, Databases in Depth - 1.1.10 Join Queries

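Preface: these are my course notes for the "Server Operations and Development Engineer" (《服务器运维开发工程师》) track of the NetEase micro-major "Python Full-Stack Engineer" (《python全栈工程师》). Comments and discussion are welcome, and many thanks to the instructors for their excellent teaching!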

1. Course Objectives

  • Inner joins
  • Outer joins
  • Subqueries
  • Combining records (union)

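2. Detailed Walkthrough

2.1. Data Setup
2.1.1. Table Structure

Prepare a users table and an articles table.
[Figure: structure of the users and articles tables]
Run the following program to generate test data in bulk: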

from mysql.connector import pooling
import random
from datetime import datetime, timedelta
from concurrent.futures import ThreadPoolExecutor
from pymysql.converters import escape_string  # PyMySQL 1.0 removed the top-level pymysql.escape_string

cnxpool = pooling.MySQLConnectionPool(pool_name="mypool", pool_size=30,
                                      user='root', password='root',
                                      host='localhost', database='mycms')

################
# Article content
article_template = '''
What’s New In Python 3.8
Editor
Raymond Hettinger
This article explains the new features in Python 3.8, compared to 3.7. For full details, see the changelog.
Python 3.8 was released on October 14th, 2019.
Summary – Release highlights
New Features
Assignment expressions
There is new syntax := that assigns values to variables as part of a larger expression. It is affectionately known as “the walrus operator” due to its resemblance to the eyes and tusks of a walrus.
In this example, the assignment expression helps avoid calling len() twice:
if (n := len(a)) > 10: print(f"List is too long ({n} elements, expected <= 10)")
A similar benefit arises during regular expression matching where match objects are needed twice, once to test whether a match occurred and another to extract a subgroup:
discount = 0.0
if (mo := re.search(r'(\d+)% discount', advertisement)):
discount = float(mo.group(1)) / 100.0
The operator is also useful with while-loops that compute a value to test loop termination and then need that same value again in the body of the loop:
# Loop over fixed length blocks
while (block := f.read(256)) != '':process(block)
Another motivating use case arises in list comprehensions where a value computed in a filtering condition is also needed in the expression body:
[clean_name.title() for name in names
 if (clean_name := normalize('NFC', name)) in allowed_names]
Try to limit use of the walrus operator to clean cases that reduce complexity and improve readability.
See PEP 572 for a full description.
(Contributed by Emily Morehouse in bpo-35224.)
Positional-only parameters
There is a new function parameter syntax / to indicate that some function parameters must be specified positionally and cannot be used as keyword arguments. This is the same notation shown by help() for C functions annotated with Larry Hastings’ Argument Clinic tool.
In the following example, parameters a and b are positional-only, while c or d can be positional or keyword, and e or f are required to be keywords:
def f(a, b, /, c, d, *, e, f):print(a, b, c, d, e, f)
The following is a valid call:
f(10, 20, 30, d=40, e=50, f=60)
However, these are invalid calls:
f(10, b=20, c=30, d=40, e=50, f=60)   # b cannot be a keyword argument
f(10, 20, 30, 40, 50, f=60)           # e must be a keyword argument
One use case for this notation is that it allows pure Python functions to fully emulate behaviors of existing C coded functions. For example, the built-in pow() function does not accept keyword arguments:
def pow(x, y, z=None, /):
"Emulate the built in pow() function"r = x ** y
return r if z is None else r%z
Another use case is to preclude keyword arguments when the parameter name is not helpful. For example, the builtin len() function has the signature len(obj, /). This precludes awkward calls such as:
len(obj='hello')  # The "obj" keyword argument impairs readability
A further benefit of marking a parameter as positional-only is that it allows the parameter name to be changed in the future without risk of breaking client code. For example, in the statistics module, the parameter name dist may be changed in the future. This was made possible with the following function specification:
def quantiles(dist, /, *, n=4, method='exclusive')
Since the parameters to the left of / are not exposed as possible keywords, the parameters names remain available for use in **kwargs:
This greatly simplifies the implementation of functions and methods that need to accept arbitrary keyword arguments. For example, here is an excerpt from code in the collections module:
Parallel filesystem cache for compiled bytecode files
The new PYTHONPYCACHEPREFIX setting (also available as -X pycache_prefix) configures the implicit bytecode cache to use a separate parallel filesystem tree, rather than the default __pycache__ subdirectories within each source directory.
The location of the cache is reported in sys.pycache_prefix (None indicates the default location in __pycache__ subdirectories).
(Contributed by Carl Meyer in bpo-33499.)
Debug build uses the same ABI as release build
Python now uses the same ABI whether it’s built in release or debug mode. On Unix, when Python is built in debug mode, it is now possible to load C extensions built in release mode and C extensions built using the stable ABI.
Release builds and debug builds are now ABI compatible: defining the Py_DEBUG macro no longer implies the Py_TRACE_REFS macro, which introduces the only ABI incompatibility. The Py_TRACE_REFS macro, which adds the sys.getobjects() function and the PYTHONDUMPREFS environment variable, can be set using the new ./configure --with-trace-refs build option. (Contributed by Victor Stinner in bpo-36465.)
On Unix, C extensions are no longer linked to libpython except on Android and Cygwin. It is now possible for a statically linked Python to load a C extension built using a shared library Python. (Contributed by Victor Stinner in bpo-21536.)
On Unix, when Python is built in debug mode, import now also looks for C extensions compiled in release mode and for C extensions compiled with the stable ABI. (Contributed by Victor Stinner in bpo-36722.)
To embed Python into an application, a new --embed option must be passed to python3-config --libs --embed to get -lpython3.8 (link the application to libpython). To support both 3.8 and older, try python3-config --libs --embed first and fallback to python3-config --libs (without --embed) if the previous command fails.
Add a pkg-config python-3.8-embed module to embed Python into an application: pkg-config python-3.8-embed --libs includes -lpython3.8. To support both 3.8 and older, try pkg-config python-X.Y-embed --libs first and fallback to pkg-config python-X.Y --libs (without --embed) if the previous command fails (replace X.Y with the Python version).
On the other hand, pkg-config python3.8 --libs no longer contains -lpython3.8. C extensions must not be linked to libpython (except on Android and Cygwin, whose cases are handled by the script); this change is backward incompatible on purpose. (Contributed by Victor Stinner in bpo-36721.)
'''
titles = article_template.splitlines()[1:]
###
# articles table (named `articles` so that it matches the INSERT below)
create_articles = '''
CREATE TABLE IF NOT EXISTS `articles` (
  `article_id` int(11) NOT NULL AUTO_INCREMENT,
  `article_type` int(11) NOT NULL,
  `title` char(255)  NOT NULL,
  `content` text ,
  `author` int(11) DEFAULT NULL,
  `pub_date` datetime DEFAULT NULL,
  `edit_date` datetime DEFAULT NULL,
  PRIMARY KEY (`article_id`),
  KEY `book_type_index` (`article_type`),
  KEY `author_idx` (`author`),
  CONSTRAINT `author` FOREIGN KEY (`author`) REFERENCES `users` (`user_id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

'''
# Create the table, then fetch the user ids used to pick random authors
cnx = cnxpool.get_connection()
cursor = cnx.cursor()
cursor.execute(create_articles)
cnx.commit()

user_ids_sql = "select user_id from users"
cursor.execute(user_ids_sql)
user_ids = cursor.fetchall()
cursor.close()
cnx.close()

def createBatchArticles():
    # Borrow a pooled connection; skip this batch if none is available
    try:
        cnx = cnxpool.get_connection()
        cursor = cnx.cursor()
    except Exception:
        print("wait...")
        return

    sql_list = []
    sql = "INSERT INTO `mycms`.`articles` VALUES "
    # Build one batch of 10 rows
    for i in range(0,10):
        article = {
            # Truncate before escaping so an escape sequence is never cut in half
            "title": escape_string(random.choice(titles)[0:200]),
            "content": escape_string("\r\n".join(random.choices(titles, k=random.randint(10,20)))),
            "author": random.choice(user_ids)[0],
            "pub_date": datetime.now() - timedelta(days=random.randint(0,300)),
            "edit_date": datetime.now()
        }

        values = "(null, 1, '{title}', '{content}', '{author}', '{pub_date}','{edit_date}')".format(**article)
        sql_list.append(values)
    try:
        # Send the whole batch as a single multi-row INSERT
        sql += ",".join(sql_list)
        # print(sql)
        cursor.execute(sql)
        cnx.commit()
    except Exception as e:
        print("error:",e)
    finally:
        cursor.close()
        cnx.close()

pool = ThreadPoolExecutor(10)
# Each submitted task inserts 10 articles, so 10000 tasks insert up to 100000 rows
for i in range(10000):
    pool.submit(createBatchArticles)
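
The script above assembles its INSERT statement with string formatting and manual escaping. As an alternative, here is a minimal sketch of the same batch insert written with bound parameters. It is not the course's code: it uses Python's built-in sqlite3 module as a stand-in so that it runs without a MySQL server, and the sample titles, the user ids, and the helper name build_batch are made up for illustration.

```python
import random
import sqlite3
from datetime import datetime, timedelta

# sqlite3 stands in for MySQL here so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles ("
    " article_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " article_type INTEGER NOT NULL,"
    " title TEXT NOT NULL,"
    " content TEXT,"
    " author INTEGER,"
    " pub_date TEXT,"
    " edit_date TEXT)"
)

# Hypothetical sample data; the real script draws titles from article_template
# and user ids from the users table.
titles = ["What's New In Python 3.8", "Assignment expressions", 'A "quoted" title']
user_ids = [1, 2, 3]


def build_batch(size):
    """Return `size` rows as tuples, in the column order of the INSERT below."""
    now = datetime.now()
    rows = []
    for _ in range(size):
        rows.append((
            1,
            random.choice(titles)[:200],
            "\r\n".join(random.choices(titles, k=random.randint(10, 20))),
            random.choice(user_ids),
            (now - timedelta(days=random.randint(0, 300))).isoformat(sep=" "),
            now.isoformat(sep=" "),
        ))
    return rows


# Placeholders let the driver handle quoting; no escape_string call is needed.
# sqlite3 uses ? as the placeholder, mysql.connector uses %s.
insert_sql = (
    "INSERT INTO articles (article_type, title, content, author, pub_date, edit_date) "
    "VALUES (?, ?, ?, ?, ?, ?)"
)
conn.executemany(insert_sql, build_batch(10))
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM articles").fetchone()[0]
print(row_count)
```

Titles that contain quotes are stored unchanged, because the values never become part of the SQL text.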
2.2. Inner Join Queries

An inner join returns a row only when the joined tables have matching records on both sides.

Implicit syntax:

select * from users, articles where users.user_id=articles.author

Explicit syntax:

select * from users inner join articles on users.user_id=articles.author

Example 1: find the articles published by the user with user_id 14820

SELECT * FROM users, articles where users.user_id=articles.author and user_id=14820;

[Figure: query result]
Example 2: find the articles published by members whose user_id is between 100 and 200 (exclusive)

SELECT * FROM users, articles where users.user_id=articles.author and user_id > 100 and user_id < 200;

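[Figure: query result]
Example 3: aggregating on top of an inner join (a real query would not be written this way; this only demonstrates that an aggregate can run over a join result)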

select user_id, count(*) as num from (select users.*, articles.article_id from users, articles where users.user_id=articles.author and user_id > 100 and user_id < 200) as temp_table group by user_id;

[Figure: query result]
Example 4: using inner join

select * from users inner join articles on users.user_id=articles.author where user_id=14820;

[Figure: query result]
Example 5: selecting only some columns (the result contains only username and title)

SELECT users.username, articles.title FROM users, articles where users.user_id=articles.author and user_id > 100 and user_id < 200;

[Figure: query result]
Example 6: using table and column aliases

SELECT u.username as un, ac.title as act FROM users as u, articles as ac where u.user_id=ac.author and u.user_id > 100 and u.user_id < 200;

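[Figure: query result]

Query syntax:

SELECT * FROM articles, category WHERE articles.cate_id=category.category_id;

The articles and category tables are structured as follows:
[Figure: structure of the articles and category tables]
An inner join result contains only the records that match each other across the joined tables.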
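
To make the matching rule concrete, here is a small sketch that runs both inner join syntaxes against a tiny in-memory database. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the three users and three articles are invented sample data, not rows from the course database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE articles (article_id INTEGER PRIMARY KEY, title TEXT, author INTEGER);
INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
INSERT INTO articles VALUES (10, 'First post', 1), (11, 'Second post', 1), (12, 'Hello', 2);
""")

# Implicit syntax: list both tables and put the join condition in WHERE.
implicit = conn.execute(
    "SELECT u.username, a.title FROM users AS u, articles AS a "
    "WHERE u.user_id = a.author ORDER BY a.article_id"
).fetchall()

# Explicit syntax: INNER JOIN ... ON carries the join condition.
explicit = conn.execute(
    "SELECT u.username, a.title FROM users AS u "
    "INNER JOIN articles AS a ON u.user_id = a.author ORDER BY a.article_id"
).fetchall()

print(implicit)
print(implicit == explicit)
```

Both forms return the same three rows. The user carol has no article, so she does not appear in either result.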

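2.3. Outer Join Queries

In an outer join the joined tables do not need matching records on both sides. There are two kinds: left joins and right joins.

Left join syntax:

select * from users left join articles on users.user_id=articles.author;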

A left join keeps every row of the left table, whether or not the right table has a matching record.

Right join syntax:

select * from users right join articles on users.user_id=articles.author;

A right join keeps every row of the right table, whether or not the left table has a matching record.
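
The difference between the two directions can be sketched with a few rows of invented sample data. The sketch assumes Python's built-in sqlite3 module as a stand-in for MySQL. Older SQLite versions do not support RIGHT JOIN, so the right join is written here as the equivalent left join with the two tables swapped; in MySQL the right join syntax shown above can be used directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE articles (article_id INTEGER PRIMARY KEY, title TEXT, author INTEGER);
INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
INSERT INTO articles VALUES
    (10, 'First post', 1), (11, 'Second post', 1), (12, 'Hello', 2), (13, 'Orphan', NULL);
""")

# Left join: every user is kept; article columns are NULL (None) without a match.
left_rows = conn.execute(
    "SELECT u.username, a.title FROM users AS u "
    "LEFT JOIN articles AS a ON u.user_id = a.author "
    "ORDER BY u.user_id, a.article_id"
).fetchall()

# Filtering on the NULL side lists the users who have no article at all.
no_articles = conn.execute(
    "SELECT u.username FROM users AS u "
    "LEFT JOIN articles AS a ON u.user_id = a.author "
    "WHERE a.article_id IS NULL"
).fetchall()

# Equivalent of "users RIGHT JOIN articles": every article is kept instead.
right_like_rows = conn.execute(
    "SELECT u.username, a.title FROM articles AS a "
    "LEFT JOIN users AS u ON u.user_id = a.author "
    "ORDER BY a.article_id"
).fetchall()

print(left_rows)
print(no_articles)
print(right_like_rows)
```

In the left join carol appears with None as her title, while the article without an author is missing. In the swapped version the article without an author appears with None as its username, and carol is missing.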

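The users and articles tables are structured as follows:
[Figure: structure of the users and articles tables]
Example 1: plain query (confirming that these users have not published any articles)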

select * from articles where author in (100, 110, 120, 130, 140);

Example 2: inner join (these users have no articles, so nothing matches and the result is also empty)

select * from users, articles where users.user_id=articles.author and user_id in (100, 110, 120, 130, 140);

Example 3: left join, one of the two outer joins

select * from users left join articles on users.user_id=articles.author where user_id in (100, 110, 120, 130, 140);

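The left join above takes the users table as its base: it selects the users first, then looks up whether each one has published any articles. Because none of these users has published anything, every article column comes back as NULL.

[Figures: query result]
When the query includes a user who has published articles, the article columns are populated for that user: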

select * from users left join articles on users.user_id=articles.author where user_id in (89, 100, 110, 120, 130, 140);

[Figure: query result]
Example 4: right join, the other outer join

select * from users right join articles on users.user_id=articles.author where user_id in (89, 100, 110, 120, 130, 140);

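The right join above takes the articles table as its base: an article must exist first, and only then is the matching member's information shown.

[Figure: query result]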

2.4. Subqueries

A subquery uses the result of one query as a condition in another query.

Query syntax:

select * from table where id in (select id from table)

Besides in, a subquery can also be used with not in, !=, =, exists, and not exists.

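For example, given the article and user_rank tables below, find the articles written by the top-ranked users:
[Figure: structure of the article and user_rank tables]
Query syntax:

select * from article where user_id in (select user_id from user_rank);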

Example:

select user_id from users where province='江苏';
select * from articles where author in (select user_id from users where province='江苏');

[Figures: results of the two queries]
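
As a runnable illustration of the province example, and of one of the other operators listed earlier, here is a sketch that uses in and not exists. It assumes Python's built-in sqlite3 module as a stand-in for MySQL; the users, provinces, and articles are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT, province TEXT);
CREATE TABLE articles (article_id INTEGER PRIMARY KEY, title TEXT, author INTEGER);
INSERT INTO users VALUES (1, 'alice', '江苏'), (2, 'bob', '浙江'), (3, 'carol', '江苏');
INSERT INTO articles VALUES (10, 'First post', 1), (11, 'Hello', 2), (12, 'Another', 1);
""")

# IN: the inner query yields the user ids, the outer query filters articles by them.
in_rows = conn.execute(
    "SELECT title FROM articles "
    "WHERE author IN (SELECT user_id FROM users WHERE province = ?) "
    "ORDER BY article_id",
    ("江苏",),
).fetchall()

# NOT EXISTS: a correlated subquery, evaluated against each user row.
without_articles = conn.execute(
    "SELECT username FROM users AS u "
    "WHERE NOT EXISTS (SELECT 1 FROM articles AS a WHERE a.author = u.user_id)"
).fetchall()

print(in_rows)
print(without_articles)
```

The in form answers "which articles were written by users from this province", while the not exists form answers "which users have no article".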

2.5. Combining Records

union all concatenates the results of several queries; every query must select the same number of columns.
Query syntax:

select * from table where condition 
union all 
select * from table where condition;

union differs from union all in that it removes duplicates.
union keeps only distinct rows and drops the duplicates.
union all keeps every row, duplicates included.
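
The deduplication difference can be checked with a short sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL and five invented users; the two overlapping conditions (user_id < 3 and user_id < 5) are scaled-down versions of the ones used in the example below.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT);
INSERT INTO users VALUES (1, 'u1'), (2, 'u2'), (3, 'u3'), (4, 'u4'), (5, 'u5');
""")

# The two SELECTs overlap on user_id 1 and 2.
query = (
    "SELECT user_id, username FROM users WHERE user_id < 3 "
    "{op} "
    "SELECT user_id, username FROM users WHERE user_id < 5"
)

all_rows = conn.execute(query.format(op="UNION ALL")).fetchall()
distinct_rows = conn.execute(query.format(op="UNION")).fetchall()

print(len(all_rows))       # 2 rows from the first query plus 4 from the second
print(len(distinct_rows))  # the overlapping rows are counted once
```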

Example 1:

select user_id, username from users where user_id < 10;

[Figure: query result]

select user_id, username from users where user_id < 15;

[Figure: query result]
The two queries combined:

select user_id, username from users where user_id < 10
union all
select user_id, username from users where user_id < 15;

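Query result: the rows returned by both queries appear twice.
[Figure: query result]
With union instead, as in the query below, the duplicates are not included: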

select user_id, username from users where user_id < 10
union
select user_id, username from users where user_id < 15;

[Figure: query result]
Example 2:

select user_id, username from users where user_id < 10
union
select author, title from articles where article_id > 100 and article_id < 1000;

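[Figure: query result]
Notes on this result:
1) The column names of the combined result are taken from the first query's select list.
2) With union, both queries must select the same number of columns; otherwise the query fails.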
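
Both notes can be verified with a short sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL and a few invented rows. The exception type shown is the one sqlite3 raises; MySQL reports the mismatch with its own error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE articles (article_id INTEGER PRIMARY KEY, title TEXT, author INTEGER);
INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
INSERT INTO articles VALUES (10, 'First post', 1), (11, 'Hello', 2);
""")

# Note 1: the result columns are named after the first SELECT.
cur = conn.execute(
    "SELECT user_id, username FROM users WHERE user_id < 10 "
    "UNION SELECT author, title FROM articles"
)
column_names = [col[0] for col in cur.description]
rows = cur.fetchall()

# Note 2: a different number of columns on each side is rejected.
try:
    conn.execute("SELECT user_id, username FROM users UNION SELECT author FROM articles")
    mismatch_error = None
except sqlite3.OperationalError as exc:
    mismatch_error = str(exc)

print(column_names)
print(mismatch_error)
```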

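2.6. Query Exercises

1. Find the articles published by a member, given the username.
2. Find the articles published by members from a given city.
3. Find the member information for the article with a given id.
4. Show the publishing activity of the users within a given user-id range (for example, ids between 100 and 200).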
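
One possible way to approach exercises 1, 3, and 4 is sketched below; it is not the course's reference answer. It assumes Python's built-in sqlite3 module as a stand-in for MySQL and a handful of invented rows. Exercise 2 is left out because the notes do not show a city column; the province subquery from section 2.4 follows the same pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE articles (article_id INTEGER PRIMARY KEY, title TEXT, author INTEGER);
INSERT INTO users VALUES (100, 'alice'), (150, 'bob'), (300, 'carol');
INSERT INTO articles VALUES (10, 'First post', 100), (11, 'Second post', 100), (12, 'Hello', 300);
""")

# Exercise 1: articles published by a member, looked up by username.
by_username = conn.execute(
    "SELECT a.title FROM users AS u "
    "INNER JOIN articles AS a ON u.user_id = a.author "
    "WHERE u.username = ? ORDER BY a.article_id",
    ("alice",),
).fetchall()

# Exercise 3: member information for the article with a given id.
author_of_article = conn.execute(
    "SELECT u.user_id, u.username FROM users AS u "
    "INNER JOIN articles AS a ON u.user_id = a.author "
    "WHERE a.article_id = ?",
    (12,),
).fetchone()

# Exercise 4: article count per user in an id range. The left join keeps
# users without articles, and COUNT(a.article_id) ignores the NULLs.
activity = conn.execute(
    "SELECT u.user_id, u.username, COUNT(a.article_id) AS num FROM users AS u "
    "LEFT JOIN articles AS a ON u.user_id = a.author "
    "WHERE u.user_id BETWEEN ? AND ? "
    "GROUP BY u.user_id, u.username ORDER BY u.user_id",
    (100, 200),
).fetchall()

print(by_username)
print(author_of_article)
print(activity)
```

Using a left join in exercise 4 is a design choice: with an inner join, users in the range who have published nothing would disappear from the report instead of showing a count of 0.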

3. Course Summary

  • 01 Inner join queries
  • 02 Left join queries
  • 03 Right join queries
  • 04 Subqueries