Cracking 58同城's Font-Based Anti-Scraping

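1. Preface

I recently took a freelance job to crack the CSS anti-scraping on 58同城, and then the client bailed on me. So I'm open-sourcing the work here for anyone who wants to study it.

2. Main content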

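First, open the page. Part of the information on it is obfuscated with a custom font:
[Screenshot: the obfuscated fields on the page]
The affected fields are gender, age, education, and work experience. They have to be converted before we get the data we actually want.
Every obfuscated element references a class named stonefont, so let's look at that class:
[Screenshot: the stonefont class definition]
The class pulls in a WOFF font file that is embedded as base64. Extract the base64 portion, decode it, and save it as a .woff file.
Python example: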

# -*- coding: utf-8 -*-
import base64

font_face = 'd09GRgABAAAAACJgAAsAAAAALkQAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAABHU1VCAAABCAAAADMAAABCsP6z7U9TLzIAAAE8AAAARAAAAFZtBmY2Y21hcAAAAYAAAAHrAAAFTgf83VJnbHlmAAADbAAAG3cAACFEC30Q92hlYWQAAB7kAAAAMQAAADYZ7i/JaGhlYQAAHxgAAAAgAAAAJBFoBf1obXR4AAAfOAAAAD4AAAC8Ri7/tmxvY2EAAB94AAAAYAAAAGCyTrsKbWF4cAAAH9gAAAAfAAAAIAFCAJZuYW1lAAAf+AAAAXIAAALQd5CEoXBvc3QAACFsAAAA8gAAAe3GCLPEeJxjYGRgYOBikGPQYWB0cfMJYeBgYGGAAJAMY05meiJQDMoDyrGAaQ4gZoOIAgCKIwNPAHicY2Bk+8g4gYGVgYNVmD2FgYGxCkKzCjK0MO1kYGBiYGVmwAoC0lxTGBwYKn70cZT/fcHwmaOcSQIozAiSAwDEswwFeJzN1L1O23AYxeFfnJR+0YT0g7bpR2jatE2aMjB1rMSAhLgAsnRjhYEFcQsgLoAVMXAxIMEahiAnIZbtmMgxygQ95mXvFKmOnsS2HMf6n/MGeABkpSE57c6S0R7OjM5m7s5neXJ3Ppfp6PgPv/Wdj2y1T9v+xZq77x67551et9k96d1cLveL/YpX8Krekef7jWAxWAm2w3K4G7YGzmAjyker0dmVP6wND+JSvBkfjrKjZlJPlpL1ZCdpXVfHC+O921v9zqTvP7ktozWa3Cu9v6NMckphioc84rHyeco0z8hTYIYiz3nBS14xy2ve8JYS73jPB6VWZo5PVPjMF6p85RvfqVHnh3L+ybxuPjXRtfnHwv0n23T65vy6P9KqsHVPj9g+NUqBtm/SCbpYM+kUufsmnS732Cgt3HOj3Oj0jBKk2zTKku6JSaeud2OUL5fLJn26ftEoc/oVo/TxCob0s2rUCLwjo27g+UYtwW8Y9YVg0ag5BCtGHSLYNmoTYdmoV4S7Rg0jbBl1jYFj1DoGG0b9I8obNZFo1aiTRGdG7eTKN+opw5pRYxkeGHWXuGTUYuJNoz4THxo1m1HWqOOMmkZtJ6kb9Z5kyWgCSNaNZoFkx5Be2zKaD66rJv2nHC8YzQzjPcP8X0MmGqcAeJxVWQ1cVFXav88595yLiMgMzAwaKsMwM4hIBMwMIiGySoqIpOQSKYuGSERoSuQaESohESISGhoZkZqZoZm6rZqhua6x1hq5amZmSkZmrln5AXMP73MH7bevP2fmzjD3zjnPx//juRJIUt9VKVoKkIgkuWIMAcMDwiT8R/FzVX6Ve0uDJZNkliSTzhxtNARwxTAcDAGyBXQx0U5HrJnqnBYTHvmZ/wH/PP7bmhc2fiS+uyB+37tmszh3LAdeef1d8Srlgz55s+LISFn/r+Zjt1iOuG/V81+rT6ldnK9ZBt74e1ySxGdKJrdLA6WR0v1SjBQnjZMkf7PO5lAcZoPF6aBmpyuGK2aiEHOI3WY3BrucLjNh9lB/s9MOZqfJbmZmhxnMJmUkcdmtiukoj63x7XE3sRWkc+CA1oHedH2A3iY62zu8jOeg0Ojn4+6ZrI4sII0z1IlchgDiJJSpHeRz4sRn7UGIs8dNnLIsjvQQbnenUMozM8vzRUV67uNThGgUB7wmFAXuWlQjJn5Uvx6mDc+AdcHy8e5uKI78sN1G6TtaPLU4r1V+4sOkcIyzWW+gVuawmnShJuZ02Z1mZlKszF/BpYfiK3VamXEwhA6CQppHqcgWbvjuU3Xez+JDmfBsups0MCYGiTuck32ECB0kqHcoXQ3x8AYDWbSpO902B8y/BbcJacE9JHBg3o45sgxv4OXiS5NycRtis/gVLzfkviIZ6N01hnndxzukoZhvjL1TsYdamF4X40fNBrOecUWiBGxT6Jjx7m14Dvm4CLxZG6PPuqvqqDE/Wn3mWLMaE5YIC8i3G0x
DKIVBoBO/iv82u7/rm5sEnLyhvikanJ7fEl8oJ/h6rD0JZAZmh8uMgRgB48CuhcTsHAdOu4maMQ7mwVpO7UwxG/hgMJoUyQpuGAAxcKS3ZY6wUPp9+V77xEdEKiWMPBTgPSXrgdVP0A+4TNw6ZZ78NedlvfPmyHXv9RVU1c0V3iIYxpaugZWNt/2qGaGP/p3z5wIGRNz3iRaVm2P0U029XvWE3Gxu+Do7aWITeJ27m0NJUobyEZJVkqy6WJeFK2DH/za7xWjQ2VyKP3WZcH0xThe4TP7UpvCpg9nQAAExCWJooD6Ie+cGPUfpsjr4NFanxBrgi4G3OJ+d6iOTF3YtjOh9gjE5Lzzs+aLP3QoWIs0j73Dee3bphNUxcgjn7mnqJgpap4rzyh7eJAVJFkliWDIys9lDraFWvT/GEcNnwj4xmAnIFNtFNknKlGKxR7wsTqsvhMaTh2CY6Bad1YWCjMZyGr8rY4r8jvjLn8W3kLtafaJ4HpAesJSvO/TRpXdSspd99PJLEAZDG7HQfgvtOdmo/iK+2kYviDNnSp4FgwcrxD5lBw/C3o2WXFrX6geDS3FYmcECXAkFoi2vv54UXJ2WaCxwreRxnUa7YtYa3cpUthvrlL7m3imON/Fd8SHqEnF1tzrTmEw2bnSfY1h83z4GJ4xc9SZEgTx2HyGymCM2e/nROi3Z9RBxrLcMTPQpHiQitvnJcib+7z2VkZoFjaKhLiF7KcTGBS3E7+ZXzYoiQIPHxMhydV5CXLWo4T7iFGOSJON+rnoBnyINkeKlqdLDuCMMrPN+MBvsCq5YZ4vBnTldlCs2MGtI078zk4KVGmoxaaWgV7B0HRTLI0bnp7hkbGy7Jwwup85q0cWYL88VPVh75Gi025kKZe/NEBeqGfDY+FhaCMkuEGufwlaGW2/QSFlWf+Nx8OzszhplMK5dncX5omBoapiQCglnCsTV61WiMCoSqopT1KE3bpAEqIKj02KhkvPu4+4JnFdM/C8cK8hJIB9e95blRGMeBZJvC+uktPca/iZc30lSCvPd5YRQWXwuLmxNO9Fac2G7uNmSmth4rhaMHWqa5Km7S0oX9uwYKR0RQu+PcGw06ayKhmT4bNa5TDqnMgIU22DQWfHVrHdpmGbWGU2ylSuyNZQSjJlLjwE1Ear3xzIn1GbX+5OhzT6Nsn7mJvH7z/Dko+Ld+2Dh5GlvBp+mdMrLy+HNAYhbVdiekIUHW0sbQRDf0AAhYTMHtMUVh9HOJft2NS9aV7f3w53VLc27zyQm3th3/FmEnMDtjR0FVCaklhCIk+Va8ObY5Yhf8+wYB3GM0sVDxDdHLk0dJ6rCCPfNnkTkAMgQoi2Hl2eKU+I8REBarCgR28UJcSoPjMDF531SchgEwwTI1mDBC2NzW7nOP5eSpBQpTZohPSbNkfKlIoQNrBAbw+g4MESI7g6XyRA6ApCuXBghq8mgQZzVpD0pRoPiNCH+YwEhBGKH2CyKP7hk7HHtMmYTNXlaHbC28FQXwiZ+y6r9SWd22JkJxhLs07nq6V/IcLHicfiP+AtsLw2B12dHIqBkTGIMwveKYIwGRP6G9aXGHyTkyXlwlPOa0uUINoUY40Feyyh9OR62QkiSqHHfIGGcq2fYwCyxSbQFhUOB3CQkvMKI77jaQyQhYUnmgcCoEhBRIprSurAq7OTqds6rkWEOcH4Af3HB/WWMTSqhtHq+I5TzaK90WSYVEVjQSGV/w+WN9GeDwjmno9yLlyzMqy5n8VMXKz0/la+blZCWRaPns2zOV4oHD6QPv2pbJ7wOyHKlFnutLq8o/+arUTk8pKEPhtYayjCCSKCmwcAwqhh3Az50ARaPdIiJfhDjqYGlJ+7MYJaYBlQ6u6KT+gPsGkmbEV5gqahUe+KwM9rEKvIEpRsFtg39k/oSqVAZCVZXXys9wPW7lpa48+R0aqv2wqaGVbBeTcD4NcaGqUK0/3mM6FbPBKYQLwhR0xnMGjZkejun1HcphuV
Ps9OavAYqFORaGZhwX5Qn5OsLLi6QZ19jLNCJ9F0rjmVPT94OcaIEGbxRnDufML0NJO97nERH8WzEXuQkX1AczlBXjImSeZFfz34OAqekyvIoNjpykak9ZG1yZc8jHrxGkFvCM6RA5I8IjFis0xUJjthEiNbEHVeGg58vBEeCK8DoUXm2YO4fYIx2xtpC+Mst5fNKn/70146Fi/PLN5Z3i28vV7RvW1Wz433xy864xpPrX/lmDXxVdTFqzK6C4r0F8/cuyP8g/oGL4tYXJSWdtTVb3llZseNdkl6wdt2JhtUa1vbNV27zQ6hAbdKD0gTckiOUmYwa0GqgaeVaOgYA1jcCLSZKhwCrpRKcLq5wZtde7B40QTlokgDPMMiMJsMNmPoi9vfynTD6Vn4fyo1rYnVYHEmBPHcJiRfpnEMsOSKyoDbqpBoAv/f+VxQdueLDb4rW0NfePFxoO7uvad/BX9OnNQohwqAFhnNxXVSxT+sxaemTave3vLaueufWT9ZMSj4IHT2R0F2u57xeBMpyPTIL2TrUlpkUktDY8qvPgGoIga0JYoPo4i+8kipyVd/MSdOT4/I8msstj+cWKRh3HmuzhHDFkQgx0bKWB19iMBusWMiYGlci0Ic3VuceeSTt0PJOkF4/eHhFLCXuTYtkefLWXXtflBc/v3ruYwenZX7X/m7PiqqSwrQ2zh8Iqtr/flnFoXsalCmdnEl6rUv6FadL0w+gs+jsLE5pQoo/rC7ifBZ0gLNUdIiMvQ+rYzl7bDIK4JVjK8SNM63XwKu3kCMF3b2mOKsc4o3SKKw/FIig8yg2ncYAoSYF4d/fjprRLmkKj+likOYNNjOTlEOxmSJIXIAODYOdwyyzxKTIDCzvs4T0qb+kO8kYGOr+agtpb4Z5BJaIk+JZSOyOzYJnyEPubp4VKfI1QbABcXwVkNn7hSzCUiPhSzKgEXVngWjivJExdRtKNty3VSnhT6GoDUeF4kRWT5ImSZ4laZVm1js0xHVgSGwGRY+oakLidoSamdFF9bh4vQ4B16AgsGh12P9Aykfwdd2P8OIi8STlkXGaIk7h3KVf4PYixLsGg9nFMwgJL+x9EBaIKOxwWfQ4E8+zAz3nZJmFIvBFwS6R3v9we8EuTbzjIaVpYgMDOqeLkNdjnxPPXnE/jmeXwGIES7L4c1kZq35GabEsjw4WO9Qi0ZacBGEkiNwIm5vz71k9eyAQWlDi5b4985/V9dWyfPH4PayQJ/F46T58EyAhFDqMwU4/LDrFX2NrOyPIO9GyPEl8Kc4++TpM/Oe/P2p4Zgz52ECIUf3Bf4Bx1U0YQrwuiSvjj+V/AGGbB1E9pc/fj1v9bvJ3/HdPP/spP/Oz2M+aTsaqABfymEsTRzo/LZZDkOxQNnlaFqub+bs0GQCeZ3q6YWb6nF9JHKV3rohzUROuk1NLxqur6evNO9r2r4FrYmzvqqKmspJtsyB3Q0HGznhem15cgfC83CiMaoJoTZ4EPqSaVDFWfMcNq8meKAxPBRa2XByLZi0hFmqJV4RoEsmRUIvh6LslTij7eRmuNUFK1qrCroloBBsUd6ipUUnrPObDoG3DZtdZ9FjQim00uCg6X1z0ODAxI9Ukj78dN6N9y7Nl24CuEnEEJsnyhmrfoVAJ5bG7WyFRluNET1b90oUVi9ah9iptDSFbqR/nzy+oxbfje33Eh8gMMDaHEFU6P+t8TUGdOJDbVVKQcxOiYUnd3CN+PqmpQQmNgXpxbN5IQi5mlJUuqi90R8AS2POoOH5q+uGUAfK/8PQuVWwd6F8/PFG89XeNlda/lyIvyttWUdogOvJ3FS6de68mHuUhkg++IRK16BEaNASWH/1N/Gg+boWAH1SJvAdmb7ZFzFRHiZ9ARw7AK+9h3/bPCUxKLT+N+b5PcyP+GLkYNLI6C8pBi05TzE5/qxk/0SvYNDpUNcpJSmR5OxK99yZ1w3a1dfZDhPLB6gaUA/mc9qwiOah4WzlllgsnxdJ
I98IiNZB0F3H+Fzj4rYh3wsElYp8vzF/56JF7uPaIcpi/onGZSRNCZv1AMGv5k/w1J6RDPtSkgKKZDEOoXdFjLxuwvX1pi5pAjiCOqNdffC88GYZu2uRIgkPbcSUj1VvkHKWqDKHYS/mpFxijBe469araaQCxATpFHRRfP5yMEhyORTRUwdax4hUUPc+NENGcb0A5I358KWedLGv6pK8LY2yVjFIILhf7jdAYZ0y03qA1oaTzk8zRJrjHtUgB9PZfKz/9jbDiXz69Ln46fVX8CnMgeHOOmvHWyoqWV1+u3MQmJIpW8cV/RM9Xl8Q5eBpmwAa4NNoN688eaGzZtfteXFR5APdD3JOYGc0pbj6G3ZXkeh1F0WmXuftjzs1EFdsPiiSNt8Jx06cQiL/eSPHdJYqKbBcC17CSZVsGivUwSDxM6VvJkscn9T3g5ce/lCKlKeiS/qwh6l3vaVM0a4d+ABEUJS4aJjz4Q3w5MEdOlwXBVeHaJy6nXfMAuESbSfEQCFGMzKU3KURrQLuTBsOoByKbIkYRdfxmdc9jWZA2KQPNPlQV/f6CqMSjq1VLffS1BeWCz3/sthwdfzO9zC9o7syaqKCm7FThRrnpV0KSsGarQzfRivIDpcu3q4nLtzZVbWF7ahvCn3+2DjvkzneQRcj8QIGuvF6Wy8vUsitkeqZfOcluuHjzCfdzoiXS12f6zEAvnxutR9EUD+85JrYhJCtDjYY5IqnozM2cPuloFoRczbjdnv1hfw7+Rq+w21rmXZ4M+5ktSDvBulhzCIq1mGijwY95hmoaIOe+caxlv3hs2ZN+8Iy49FZ907+OXhcdWz8RX/acW3nyudcrwdaKuQnoK/571r83ixN/lvWfrvuiT5reP7MT/1CO82WSzoO9kmY9qV5DrX4DrqdgNEmKFlA7MrETjAi5dnZxU93tczWfHf9BREUmwu22fbLayL1FcGlywqIcsukceFVtam6u4DNEk3pa7O+TbpeL78WbRxPTuhGRIki26rc8QXSmRzo3Zpay4p7CSjbrpDiTBbEnPdr8R8SHZdibD3sYF52/SZHRGvKBgALO7ikJLBOnv8WjE+69mB2g18AEQRb7V+knahM67Xs47PTgMA9U5aFkKuxAW9ir/hRE60EX4J2a8mFmi6+xomGbXh/Qnat2x0WBO/1iociMiIXyueKGajs1U1yW5U1p5+oZuxqliZAuxibGTqxaUlpYghKPFzUFsZfrkTgcszeKdv1UVG8fdWdmET4zUy/zfVegHZmlJN0JRSRKRIgjyYnIukuahBdMggplGfqj6oMdnK/IefxwfXxEfEN1fm117w0DvfawqD8zs9+39P3KylBHDdAUGbPYqZlaIGYAoCO2cAX7teyb+1aoz639hqSfgl6x4crAQWygb/foQX+BySKLD+r5Fp4klSOmptjUMurJ/znlE74Uc58hzdTyr/enMUZ/qs1Y7Kx/gGcxUSeGkHvKwKGz3FM72HnI/k4LIwq516eYFITtccA8DtZpRyg0SYzsuIZiBPUJp9ok5Uux6TXYmR4r4ta8T0jhAV991eHU8KJJ12R1OalgLCHrv5lrA73115bC3EDfQPf0vT7GttxKvdFH2B40+o+A9Zx7iQUzOMgPwEwILxXHRTONpOPVdjIlBFEudtqEFzG7pr60JJhBzC+pX8cWzUqeUBIbXjMphzQk0/g9AgVKIjaw+PaZNvzqon3iMk/L8atshqHk0dPdjO15HTOsGySKOiBoobjsmeeRvp/7zsn57EfsF0RI0DkVE9IVhsSu1Z2EkfGnnNDyV//akVd5EDlpNGOHoOoFYYRbslxwjf3oBrxsK2OisXODbfFXxImaAx6HF/aLN/t9qVDu8OEe3an5Uk1YmLQJttmIIHy3zlmMLiDGjLHuz8pICNY9Sv3CvdSZPItzd4+vjXZirN0+TWpmRBLJ3K0GyFJrA+ir9X5eabMC/IaiP9kieuiuEyA1ezX2SSdys9eps0TthFQ
IJ7n16o2NpYv2Bg0/mncMVV9/3fXIT/Gx6CA1J4D0E4wchEzk76cdmYMXgB8QuCp+6rkq1GMwBmJFpzhE3Mgxm92XxRIxB2pgmfoqWUxe8lwPyWAOT8Z9WvpRTq/jdo4XdMRKMdEeggux+/8PwZ2saDzxPSH5v3zSJ0Hgtz+g7N0pTr/19NMtq5Zu2bzy+U0/p2IXRRJyGHxOdYFVrBWbxQwR7ZCDmv/29rNvf31QG8X2XRA/K8dQ1w+TwlDV9+u3aZhJGoNSDLNIXagx794tuFfN/n9IT7MLlD+G2pgJQE3AFFM/7XiemdPKMSfgFdkRGybmck5yEgc84+vDhShq8ParX1jr5eMlgs2DIkgy6axf2FRaCbm9UZR8absB+UnhouX8cdEQmgR1J8VWGod+Ki39Rl2msyFezKqdm9yY7M7tBiORu4R8Rz2rFdIMh/uUZ+B/XZzgqZn6becgBSa8QQb07IM8qIxCEGqrGaUMSRLTj2QkgAzFYp24EJsKAXBEDa5B/VFqEycvhENAa4ToOh8NUkBkckZiVIqkoYzUN8zLG/k6AiOVIxVIT0ul0vPSCuklzXXbFYPFYVVQ6yr+WgNgrToNnhmV08480yltbGU0URtilA553l9xxngaBgXVQNCABKUfIvQIMGijq/83atGEgd1mVYx2po3DLIoRrRRmgGmfIwD1y2abXeuBEx98Ui8uTCdk5HzUYYRztZnkYWge5HzLMLilDUBf8YfRtg8o5TzsjrHjT/JUWQ5u8sqZO20iRqBDhst4HkH753xxpThIZH590VYvv9by4l7BEGFL8ZOx4uRgBmoPLGKse/J4QuliUQOl1P0iLEIVJOrog9n1FbmlNG5JZWJDIdwp5ryiglK/KkpJV0TYrIq36QTON01ApogPiGasEu2f72lV5WQDicUfEGU7UUQQ0huAav4beApax6kO3/x5gxGmKpawJy7KSZm+RZ1O2D5fHoYCJsmY43pYlp2+kFw1ffrwuTzcnZmXk1+Yx9qTE/ITncjvfTfER0o7cuko6REpW8uZJ3roq9GLjNBQxeGRdpp+YkZtoGX3TAtdNsqcLkyT04486hnS2j2Jc7oCTIqJBvSrME2XYbGb9SYnOUNQXHH15o9gJiQywYmJCEY3fQH3so0TeZVwfwdpabFi/0VxR7SEJ8E89fwC99pXVx1IRzV9yR5RvDZhTcHyxU/JaGGPZSz08S1NmLnyJb5QXZbdeZX8Ujnu6OgGQpJBwoRtaZ716OrcznDfJLHw9vuzs5IrPzr6clJ6yYc3oLpxVDjsYOyxKOAcKlGHluG6NGu9zBqamzCZqRUnMdWZM32ySdTo+yFeHDl74a72zVN28AptruLyDI+0m264b/1gCPWICaxGbZaEhWdeS664q9GPLJHlJYt3aB77x5SH8OU14f092RWUKQO7s0SmOTGL4bMVEI7sJ152dODLzKgKzL+7FWvCWaDBIEPMv6z8wHei3ouTJnh0sefugfb7/RNfFDSYIIvpDyGj00biGHREJaU/a7w/a2hiPB2heDoF/4oi2eo5a0TnTHENcxGIK1vFCPl41otYVt2UptQ5ixJzGwqqhtKrRRsbczdUbrgI2VFO0XT2sqgKi4NF57eVbCFB4ZOq0p3LyfLeNAbmmp3tCcknSHDR2SNL22A35ENuhTfUvrWH0sNogaxD0h6Z59ydmZFT01sgR4i2IzOhTXTt3QuRcWJL+4Hc+Fnl7ecakjKWH2+HFLJlUWN8TVl8bVHTBHFChIeD1+7s7t8Kbnju8d5Q/sHb0GtapSjE6z9JkzVNiI7E4ghFi21XqMXfjkdGw/+aBSvTU6tZh8bNqKHT3ahgULTbvgZnDJa/A8+2wpP4RZCb1Jpp4tKMiZOaZJm9thQQJ7qXt/oEfDh3qXi3930ugw+tVitLvLDPX1LULtoYEbZujJM+eTuDEfaZe/h7Y0kPP/yTWAHj5ZPCsnN/EiINGne3TKOW4mHyW+LBm/KEbN+
lBI27mg37CkQO1oFPEDIWT3hkfMLjCypzQkTFTkJa3OHbGNtG7t5TH6Y08U+kUOkBrA702/+rzglFzumvlP6shyK62kye+0f9/Wl0QAxapMEgR6V2djXUbGk9IzrDUiHx1JnOsjNtNTnu3YyRsJS0mry4YhiwpjY7lJRsvFZ4ykffumiDiGzifErIKPl70SDKREdn4Y3LYufVuPRrqLoiwZgtuq6ABPnWp2rKUlpK4mS5gVP44M4G4cZdTc8NIE2sSH1NfQYbw+P9zEor9lcQcu9D2v0gwGXiZpwu5xCwsj8WjfkLtWLO/JXQwaBNA1xUY4DB4LJbTQ6tIbQZAeohzSZa8c+gJdqkQEtEVFN8XHlYMO0iB2/XtXkFHi6oU1eoTctAplmouQ5//faopxhsIwjevcPqsB2kLUVRqLu3bBWYDbhkrnUvrEITmytmz4jGb60X8sHuU6kJ7Ye3Qiv1YuzOEaF9MyHfp4o1YOW4m8VbHyMgIc4fWkGq9hiwqQqHM7aTsS3e1YS8v+MYpfgqPzBZTcMqovzVZ6T/A9KIhe8AeJxjYGRgYABi02dz9OL5bb4ycHMwgMDNzE9zYPT/B/92cUiz3QJyORiYQKIAadkN3gAAAHicY2BkYOAo//uC4TMHw/8H/59ySDMARVCAPgC5qwd3eJzjYACCFAYGVlYGBg4GVMx+EVMMhlk2QjCMjS4HYf//jiwH1pOGqYd1IUzs/1tUs/8/gJrzCZsbADb6EhsAAAAAAAAADABEALQBBgE8AZYB2gIiAooDFgOkBGIE4AT6BUAFwAX2BiAGcgb2BygHgAgACCAIYgiyCO4JHgmwCeYKMAq+CuYLfguuC/oMIgxeDPYN1g5sDqgPNg+2EBwQonicY2BkYGDQZ+hi4GQAASYg5gJCBob/YD4DABuIAdkAeJx9ks1Kw0AUhU9sVWxFQcGVyqxEUFN/du5E0W6K0EWh3aXpTI2kmTAZCz6H7+DT+Azik4gn6VWpQjPk8t1zz525AwNgC+8IMPv2+M84wCazGS9hFcfCNWzgQrhOvhJeRhP3wivUB8INHOFBuIltvHCHoL7G7BKvwgH28SG8xN5P4Rp2g3XhOvlQeBk7wY3wCvWBcAO9YCrcxEHw1lDq2unI65EaPitjM38SR84l2rHSSWJnC2u86kdtnXT1+CmN3I9aifNZT7sisZk6C0/nC3c60+77mGI6PvfeKOPsRN3yTJ2mVuXOPurYhw/e55etlhE9jO2EcyuuazhoRPCMI+ZDPDMaWGTUThCz5rgS1p30dJjFzCwK/oY+hT59bXoSdBnHeEJadf73/joX1XrVeQWpnEThDCFOF3bcMWZV19/bFJhyonOqnj3l7codJqRbuafmtClZIa9qj1Ri6iFfUdmV8920uMwff0gXd/oCX7iEuAAAeJxtz8lWhDAQhWH+dmjneW61nWdtCATCkibJu7hx5zk+vp5clmbznQTqVlU2ynQm2f9nxogFFllimTErrLLGOhtsssU2O+yyxz4HHHLEMSeccsaEcy64ZMoV19xwyx33PPDIE8+88Mob73wwI8/4GX9/fYbcFzLMk0Vtk2Xbyd5L3yerUCetbWUfknWUTaU6VyjXGZNs8yiN8ttO9e3Qd26U31fl4HBvlOcL1Xund+9VF3Ll/w2QjLnmiNZJn77HYb9YmFZ2qU8si042w71L88Uq173SntEa/WerRtZBOuVazRNr7Rddqb7OhSz7Bdvma/oAAA=='

# Decode the base64 payload and write the raw bytes to a .woff file
font_bytes = base64.b64decode(font_face)
with open('zt01.woff', 'wb') as f:
    f.write(font_bytes)
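
In a real crawler the base64 string would be pulled out of each response, not pasted in by hand. A minimal sketch, assuming the page embeds the font in its CSS as a `data:` URI in which the payload follows `base64,` (the exact CSS on 58同城 may differ, and `extract_font_bytes` is a hypothetical helper name):

```python
import base64
import re

# Assumption: the font appears as url(data:...;base64,<payload>) somewhere in the page.
FONT_RE = re.compile(r"base64,([A-Za-z0-9+/=]+)")


def extract_font_bytes(html):
    """Return the decoded font bytes embedded in html, or None if nothing matches."""
    match = FONT_RE.search(html)
    if match is None:
        return None
    return base64.b64decode(match.group(1))


# Made-up page fragment; "d09GRg==" is the base64 form of the 4-byte WOFF signature.
sample = ("@font-face{font-family:stonefont;"
          "src:url(data:application/font-woff;charset=utf-8;base64,d09GRg==)}")
print(extract_font_bytes(sample))
```

The returned bytes can be written to disk exactly as in the snippet above, or handed to fontTools directly through an in-memory file object.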

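Next, open this .woff file with the online editor at http://fontstore.baidu.com/static/editor/index.html:
[Screenshot: the font opened in the online editor]
Each glyph has its own code. Comparing these with the page, the last four characters of each code match the code used in the page source.
We can parse the .woff file with the fontTools library: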

# -*- coding: utf-8 -*-

from fontTools.ttLib import TTFont

font1 = TTFont('zt01.woff')  # open the local font file zt01.woff
uni_list = font1.getGlyphOrder()[2:]  # skip the first two glyphs; they don't count
print(uni_list)
# Output:
# ['uniE0D1', 'uniE0EB', 'uniE165', 'uniE39A', 'uniE3CD', 'uniE3DC', 'uniE4E6', 'uniE559', 'uniE5CE', 'uniE6FE', 'uniE74A', 'uniE811', 'uniE822', 'uniE90F', 'uniE925', 'uniE9A9', 'uniE9EB', 'uniEB2C', 'uniEC43', 'uniEC4C', 'uniEC7A', 'uniED1F', 'uniED8C', 'uniEDDB', 'uniEE02', 'uniEE6F', 'uniEF0E', 'uniEF58', 'uniEFD2', 'uniF0EB', 'uniF129', 'uniF1A3', 'uniF31A', 'uniF373', 'uniF3A5', 'uniF403', 'uniF459', 'uniF52A', 'uniF547', 'uniF56E', 'uniF58B', 'uniF5DB', 'uniF625', 'uniF832', 'uniF88E']


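That gives us the codes seen in the page. Now we just build the mapping by hand: 45 entries in total (the first two glyphs shown in the editor don't count), which doesn't take long.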
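The manual mapping can be sketched as follows, using only the first three glyph names from the output above paired with the first three entries of the hand-typed character list that appears later in this post. It assumes the obfuscated characters in the page text are the code points named by the glyphs (U+E0D1 for `uniE0D1`, and so on); `build_table` and `decode` are hypothetical helper names.

```python
def build_table(glyph_names, words):
    """Map each code point taken from a 'uniXXXX' name to its real character."""
    return {int(name[3:], 16): word for name, word in zip(glyph_names, words)}


def decode(text, table):
    """Replace every obfuscated character in text; other characters pass through."""
    return text.translate(table)


glyph_names = ['uniE0D1', 'uniE0EB', 'uniE165']
words = ['B', '男', '王']
table = build_table(glyph_names, words)

print(decode('\ue0eb', table))
```

`str.translate` accepts a dict keyed by integer code points, which is why the keys are built with `int(name[3:], 16)` instead of being kept as strings.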
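Think that's the end of it? Too young, too simple! If that were all, it would be far too easy.
It turns out (well, I noticed it early on, to be honest) that every refresh serves a different WOFF file, and the codes in the page source change with it. So was everything above wasted? Not quite.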

Here's the key part:

Save the WOFF files from any two requests for the page, dump each one to XML, and analyse them statically.
Code to save as XML:

from fontTools.ttLib import TTFont

font = TTFont("zt01.woff")
font.saveXML('zt01.xml')

The XML file:
[Screenshot: the XML dump]
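In the dump, each glyph is a `TTGlyph` element whose `contour` children list the outline points as `pt` elements. A small sketch of reading those points with the standard library instead of by eye. The embedded fragment is hand-written to resemble that layout: the glyph name `uniXXXX` is a placeholder, the first two points reuse the values quoted for “男” further down, and the third point is made up.

```python
import xml.etree.ElementTree as ET


def first_points(root, count=2):
    """Return {glyph name: first `count` (x, y) outline points} from a TTX tree."""
    result = {}
    for glyph in root.iter("TTGlyph"):
        pts = [(int(pt.get("x")), int(pt.get("y"))) for pt in glyph.iter("pt")]
        result[glyph.get("name")] = pts[:count]
    return result


# Hand-written fragment in the shape of the XML dump (not copied from a real font).
sample = """<ttFont><glyf>
  <TTGlyph name="uniXXXX" xMin="0" yMin="0" xMax="2048" yMax="2048">
    <contour>
      <pt x="198" y="1575" on="1"/>
      <pt x="1786" y="1575" on="1"/>
      <pt x="1786" y="100" on="1"/>
    </contour>
    <instructions/>
  </TTGlyph>
</glyf></ttFont>"""

points = first_points(ET.fromstring(sample))
print(points)
```

For the file written by `saveXML`, the root element would come from `ET.parse('zt01.xml').getroot()`.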
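Now for the important bit.
Open both WOFF files in the online editor and find the same character in each. We'll use the gender field as the example, i.e. the character “男”.
In WOFF file 1 its code is:
[Screenshot: code of “男” in WOFF file 1]
In WOFF file 2 its code is:
[Screenshot: code of “男” in WOFF file 2]
Open the XML file for each WOFF file and use the code to locate the glyph definition for this character.

“男” in WOFF file 1:
[Screenshot: glyph definition of “男” in the first XML file]
“男” in WOFF file 2:
[Screenshot: glyph definition of “男” in the second XML file]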
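Looking at the pt values in the screenshots, the x and y of one pt element minus the x and y of the preceding pt element give a fixed value, and the same holds after switching to a different WOFF file. For “男” in WOFF file 1, subtracting the first pt from the second gives (1786-198, 1575-1575) => (1588, 0); in WOFF file 2 it is (1822-234, 1611-1611) => (1588, 0).
With this rule we can build the mapping dictionary: the computed difference is the key and the corresponding character is the value.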
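The arithmetic can be checked with a few lines. The coordinates are the ones quoted above for “男”. Both points sit exactly 36 units further right and 36 units higher in the second font, so the difference between them is unchanged; whether every point of every glyph is shifted the same way is an assumption this method relies on. `first_delta` is a hypothetical helper name.

```python
def first_delta(points):
    """Offset from the first outline point to the second."""
    (x1, y1), (x2, y2) = points[0], points[1]
    return (x2 - x1, y2 - y1)


woff1_nan = [(198, 1575), (1786, 1575)]  # first two pt values of 男 in WOFF file 1
woff2_nan = [(234, 1611), (1822, 1611)]  # first two pt values of 男 in WOFF file 2

print(first_delta(woff1_nan))  # (1588, 0)
print(first_delta(woff2_nan))  # (1588, 0)
```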

# -*- coding: utf-8 -*-
from fontTools.ttLib import TTFont


zt1 = TTFont("zt01.woff")

# words: the characters, typed out by hand in the order they appear in the font editor
words = ['B', '男', '王', '大', '專', 'M', '女', '吳', '碩', '趙', '黃', '李', '1', '8', '經', '2', '下', '本', '屆', '5', '應', '科', '7', '中', '生', '6', 'E', '陳', '3', '以', '楊', 'A', '張', '4', '無', '0', '9', '驗', '博', '技', '士', '校', '高', '劉', '周']

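# getGlyphNames() returns the glyph names sorted alphabetically; [1:-1] drops the
# first and last names, leaving the 45 uniXXXX glyphs that pair up with words
uni_list = zt1.getGlyphNames()[1:-1]

data_map = dict()
for index, name in enumerate(uni_list):
    coords = zt1["glyf"][name].coordinates
    x1, y1 = coords[0]
    x2, y2 = coords[1]
    # key: offset from the first outline point to the second
    data_map[(x2 - x1, y2 - y1)] = words[index]
print(data_map)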

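OK, the dictionary is done.
From here on, during scraping we only need to extract the WOFF file from each fetched page, compute the key for every glyph whose name starts with uni, and look that key up in data_map to get the character. That gives us the font dictionary for the page being scraped.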
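That per-page step might look like the sketch below. It is not the author's full code: `glyph_key`, `build_page_map`, and `coords_from_font` are hypothetical names, the glyph name `uniF123` in the demo is made up, and the approach assumes each character's key is unique. fontTools is imported inside `coords_from_font` so the pure-Python part can be tried without it.

```python
import io


def glyph_key(coords):
    """Key used in data_map: offset from the first outline point to the second."""
    (x1, y1), (x2, y2) = coords[0], coords[1]
    return (x2 - x1, y2 - y1)


def build_page_map(glyph_coords, data_map):
    """Map each obfuscated code point of one page's font to its real character.

    glyph_coords: {'uniXXXX': [(x, y), ...]} for the font served with this page.
    Glyphs whose key is not in data_map are skipped.
    """
    page_map = {}
    for name, coords in glyph_coords.items():
        if not name.startswith("uni") or len(coords) < 2:
            continue
        key = glyph_key(coords)
        if key in data_map:
            page_map[int(name[3:], 16)] = data_map[key]
    return page_map


def coords_from_font(font_bytes):
    """Read the outline points of every uni* glyph from raw font bytes."""
    from fontTools.ttLib import TTFont  # imported lazily; needs fontTools installed
    font = TTFont(io.BytesIO(font_bytes))
    glyf = font["glyf"]
    result = {}
    for name in font.getGlyphOrder():
        if name.startswith("uni"):
            coords, _end_pts, _flags = glyf[name].getCoordinates(glyf)
            result[name] = list(coords)
    return result


# Demo with the values quoted for 男; 'uniF123' is a made-up glyph name.
data_map = {(1588, 0): '男'}
page_map = build_page_map({'uniF123': [(234, 1611), (1822, 1611)]}, data_map)
print('\uf123'.translate(page_map))
```

Because `page_map` is keyed by integer code points, the scraped text can be decoded in one pass with `str.translate`.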

Verification

[Screenshot: verification result]
Full code: link
