Candidates
CandidateId | Skill |
---|---|
1 | Sql |
1 | Dw |
1 | ssis |
2 | ssis |
2 | sql |
2 | vb |
Jobs
JobId | SkillReq |
---|---|
3 | Sql |
3 | Dw |
4 | ssis |
4 | sql |
4 | vb |
- 一条 sql 查出所有符合岗位要求的候选人(候选人技能包括岗位技能要求),即所有 JobId-CandidateId 对。
- 一条 sql 查出所有精确符合岗位要求的候选人(候选人技能等于岗位技能要求)
Expected output would be, Job_ID need the respective candidate ID with exact skill set match.
建表
CREATE TABLE Candidates (CandidateId INT, Skill VARCHAR(100));
INSERT INTO Candidates VALUES (1,'Sql'), (1,'Dw'), (1,'ssis'), (2, 'ssis'), (2,'sql'), (2,'vb');
CREATE TABLE Jobs (JobId INT, SkillReq VARCHAR(100));
INSERT INTO Jobs VALUES (3,'Sql'), (3,'Dw'), (4,'ssis'), (4,'sql'), (4,'vb');
- 模糊查询(Candidate.Skill_set >= JobId.SkillReq_set)
SELECT a.jid AS JobId, a.cid AS CandidateId FROM
(SELECT J.JobId AS jid, C.CandidateId AS cid, count(1) AS count FROM Jobs J, Candidates C WHERE J.SkillReq=C.Skill GROUP BY J.JobId, C.CandidateId) AS a
INNER JOIN
(SELECT JobId AS jid, count(1) AS count FROM Jobs GROUP BY JobId) AS b
ON a.jid = b.jid AND a.count = b.count;
- 先对两个表做有条件的笛卡尔积,为子句 a
比如集合A={a, b}, 集合B={0, 1, 2},则二者的笛卡尔积为 [(a,0), (a,1), (a,2), (b,0), (b,1), (b,2)]
SELECT J.JobId AS jid, C.CandidateId AS cid, count(1) AS count FROM Jobs J, Candidates C GROUP BY J.JobId, C.CandidateId
举个例子,JobId=3 包括(Sql, Dw),CandidateId=1 包括(Sql, Dw, ssis),两者的笛卡尔积应该是: (Sql, Sql), (Sql, Dw), (Sql,ssis); (Dw, Sql), (Dw, Dw), (Dw, ssis)。
总计数量是6,为了得到技能匹配的数量2,须加限定条件:
SELECT J.JobId AS jid, C.CandidateId AS cid, count(1) AS count FROM Jobs J, Candidates C WHERE J.SkillReq=C.Skill GROUP BY J.JobId, C.CandidateId
此子句将得到技能匹配的情况,:
JobId | CandidateId | skill |
---|---|---|
3 | 1 | Sql |
3 | 1 | Dw |
3 | 2 | Sql |
4 | 1 | ssis |
4 | 1 | Sql |
4 | 2 | ssis |
4 | 2 | vb |
我们可以据此得到对应的匹配数量,比如 3-1:2, 3-2:1, 4-1:2, 4-2:2
- 查询 Jobs 表,得到每个岗位要求的技能数量,为子句 b
SELECT JobId AS jid, count(1) AS count FROM Jobs GROUP BY JobId;
- 基于以上两个子句,做 INNER JOIN, 条件是 jobid 相同 且 数量相同。
SELECT a.jid AS JobId, a.cid AS CandidateId FROM
(SELECT J.JobId AS jid, C.CandidateId AS cid, count(1) AS count FROM Jobs J, Candidates C WHERE J.SkillReq=C.Skill GROUP BY J.JobId, C.CandidateId) AS a
INNER JOIN
(SELECT JobId AS jid, count(1) AS count FROM Jobs GROUP BY JobId) AS b
ON a.jid = b.jid AND a.count = b.count;
- 精确查询(Candidate.Skill_set = JobId.SkillReq_set)
- 将Jobs 表中每个 JobId 的 SkillReq 升序排列,然后组合成字符串, 形成子句 J
(SELECT JobId, LOWER(GROUP_CONCAT(Distinct SkillReq ORDER BY SkillReq ASC SEPARATOR ',')) AS JS
FROM Jobs GROUP BY JobId)
AS J
- 将 Candidates 表中每个 CandidateId 对应的技能 Skill 升序 排列,然后组合成字符串, 形成子句 C:
(SELECT CandidateId, LOWER(GROUP_CONCAT(DISTINCT Skill ORDER BY Skill ASC SEPARATOR ',')) AS CS
FROM Candidates GROUP BY CandidateId)
AS C
- 最后联表查询,条件是 字符串相等:
SELECT J.JobId, C.CandidateId FROM
(SELECT JobId, LOWER(GROUP_CONCAT(DISTINCT SkillReq ORDER BY SkillReq ASC SEPARATOR ',')) AS JS FROM Jobs GROUP BY JobId) AS J
INNER JOIN
(SELECT CandidateId, LOWER(GROUP_CONCAT(DISTINCT Skill ORDER BY Skill ASC SEPARATOR ',')) AS CS FROM Candidates GROUP BY CandidateId) AS C
ON J.JS = C.CS;
注:主要是用到了函数 group_concat; 另外,加 lower 函数的原因是,去除大小写干扰。