Four big pitfalls when installing pyspider on a Mac

 Installing pyspider

pip3 install pyspider

 

Once it is installed, running pyspider fails with a syntax error on async (error 1)

~$ pyspider 
Traceback (most recent call last):
  File "/usr/local/bin/pyspider", line 5, in <module>
    from pyspider.run import main
  File "/usr/local/lib/python3.7/site-packages/pyspider/run.py", line 231
    async=True, get_object=False, no_input=False):
        ^
SyntaxError: invalid syntax

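This happens because async became a reserved keyword in Python 3.7 and can no longer be used as a parameter or attribute name, so every such async has to be renamed to something else.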
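To see the keyword problem for yourself without touching pyspider, here is a minimal sketch using only the standard library. The helper names is_reserved and compiles are mine, made up for illustration, and the fetcher function inside the strings is just a stand-in for the signature in the traceback.

```python
import keyword


def is_reserved(name):
    """Return True if `name` is a reserved word in the running interpreter."""
    return keyword.iskeyword(name)


def compiles(source):
    """Return True if `source` is syntactically valid for this interpreter."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    # Stand-in for the signature that breaks in pyspider/run.py.
    print(is_reserved("async"))
    print(compiles("def fetcher(async=True): pass"))
    # The renamed parameter is an ordinary identifier again.
    print(compiles("def fetcher(async_=True): pass"))
```

On Python 3.7 or later the first snippet is rejected by the compiler and the renamed one is accepted, which is exactly why the rename fixes the startup error.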

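Following the path in the error message, go to the pyspider package directory, for example: /usr/local/lib/python3.7/site-packages/pyspider

Use the ack command to find every occurrence. If you do not have ack, install it first: brew install ack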

~$ ack async
run.py
231:            async=True, get_object=False, no_input=False):
245:                      poolsize=poolsize, proxy=proxy, async=async)
365:        webui_fetcher = ctx.invoke(fetcher, async=False, get_object=True, no_input=True, **fetcher_config)

fetcher/tornado_fetcher.py
81:    def __init__(self, inqueue, outqueue, poolsize=100, proxy=None, async=True):
89:        self.async = async
95:        if self.async:
117:        if self.async:
118:            return self.async_fetch(task, callback)
120:            return self.async_fetch(task, callback).result()
123:    def async_fetch(self, task, callback=None):
155:            return self.ioloop.run_sync(functools.partial(self.async_fetch, task, lambda t, _, r: True))

webui/app.py
95:    'fetch': lambda x: tornado_fetcher.Fetcher(None, None, async=False).fetch(x),

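The occurrences that need changing (marked with red boxes in the original screenshot) are the ones where async stands alone as a parameter name, a keyword argument, or the self.async attribute: replace each of them with async_. The method name async_fetch is a valid identifier and can stay as it is.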
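If you would rather not edit the three files by hand, the rename can be sketched with a word-boundary regex. This is only an illustration under my own assumptions: rename_async and preview are hypothetical helper names, and the regex knows nothing about strings, comments, or genuine async def statements, so look at the preview before writing anything back.

```python
import re
from pathlib import Path

# \b is a word boundary and "_" counts as a word character, so this matches
# the `async` in `async=True` and `self.async`, but not `async_fetch` or `async_`.
STANDALONE_ASYNC = re.compile(r"\basync\b")


def rename_async(source):
    """Replace every standalone `async` in `source` with `async_`."""
    return STANDALONE_ASYNC.sub("async_", source)


def preview(package_dir):
    """Yield (path, line_number, old_line, new_line) for each line that would change."""
    for path in sorted(Path(package_dir).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            new_line = rename_async(line)
            if new_line != line:
                yield path, number, line, new_line


if __name__ == "__main__":
    # Path taken from the traceback above; adjust it to your own installation.
    package = "/usr/local/lib/python3.7/site-packages/pyspider"
    for path, number, old, new in preview(package):
        print("%s:%d\n  - %s\n  + %s" % (path, number, old.strip(), new.strip()))
```

The preview only prints what would change. Running rename_async a second time on already-fixed code changes nothing, because async_ no longer matches the pattern.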

After making the changes, run: ~$ pyspider

It may well fail again (error 2)

$ pyspider
phantomjs fetcher running on port 25555
[I 200209 17:01:34 result_worker:49] result_worker starting...
[I 200209 17:01:34 processor:211] processor starting...
Process Process-4:
Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/pyspider/run.py", line 236, in fetcher
    Fetcher = load_cls(None, None, fetcher_cls)
  File "/usr/local/lib/python3.7/site-packages/pyspider/run.py", line 48, in load_cls
    return utils.load_object(value)
  File "/usr/local/lib/python3.7/site-packages/pyspider/libs/utils.py", line 369, in load_object
    module = __import__(module_name, globals(), locals(), [object_name])
  File "/usr/local/lib/python3.7/site-packages/pyspider/fetcher/__init__.py", line 1, in <module>
    from .tornado_fetcher import Fetcher
  File "/usr/local/lib/python3.7/site-packages/pyspider/fetcher/tornado_fetcher.py", line 30, in <module>
    from tornado.curl_httpclient import CurlAsyncHTTPClient
  File "/usr/local/lib/python3.7/site-packages/tornado/curl_httpclient.py", line 24, in <module>
    import pycurl  # type: ignore
ImportError: pycurl: libcurl link-time ssl backend (openssl) is different from compile-time ssl backend (none/other)
[I 200209 17:01:34 scheduler:647] scheduler starting...
Traceback (most recent call last):
  File "/usr/local/bin/pyspider", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/pyspider/run.py", line 754, in main
    cli()
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1114, in invoke
    return Command.invoke(self, ctx)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/pyspider/run.py", line 165, in cli
    ctx.invoke(all)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/pyspider/run.py", line 497, in all
    ctx.invoke(webui, **webui_config)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/pyspider/run.py", line 333, in webui
    app = load_cls(None, None, webui_instance)
  File "/usr/local/lib/python3.7/site-packages/pyspider/run.py", line 48, in load_cls
    return utils.load_object(value)
  File "/usr/local/lib/python3.7/site-packages/pyspider/libs/utils.py", line 369, in load_object
    module = __import__(module_name, globals(), locals(), [object_name])
  File "/usr/local/lib/python3.7/site-packages/pyspider/webui/__init__.py", line 8, in <module>
    from . import app, index, debug, task, result, login
  File "/usr/local/lib/python3.7/site-packages/pyspider/webui/app.py", line 17, in <module>
    from pyspider.fetcher import tornado_fetcher
  File "/usr/local/lib/python3.7/site-packages/pyspider/fetcher/__init__.py", line 1, in <module>
    from .tornado_fetcher import Fetcher
  File "/usr/local/lib/python3.7/site-packages/pyspider/fetcher/tornado_fetcher.py", line 30, in <module>
    from tornado.curl_httpclient import CurlAsyncHTTPClient
  File "/usr/local/lib/python3.7/site-packages/tornado/curl_httpclient.py", line 24, in <module>
    import pycurl  # type: ignore
ImportError: pycurl: libcurl link-time ssl backend (openssl) is different from compile-time ssl backend (none/other)

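This one is a pycurl problem: ImportError: pycurl: libcurl link-time ssl backend (openssl) is different from compile-time ssl backend (none/other). In other words, the installed pycurl was not built against the same SSL library that libcurl uses, so it has to be rebuilt against OpenSSL.

Fix: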

  ~$ pip3 uninstall pycurl

  ~$ brew install openssl

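  ~$ echo 'export PATH="/usr/local/opt/openssl@1.1/bin:$PATH"' >> ~/.bash_profile

    PS: believe it or not, this step gave me Permission denied even though I ran it with sudo, most likely because sudo only elevates echo while the >> redirection is still performed by your own shell. If the same thing happens to you, just open ~/.bash_profile in vim and add the line export PATH="/usr/local/opt/openssl@1.1/bin:$PATH" by hand.

  ~$ export LDFLAGS="-L/usr/local/opt/openssl@1.1/lib"

  ~$ export CPPFLAGS="-I/usr/local/opt/openssl@1.1/include"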

  ~$ export PYCURL_SSL_LIBRARY=openssl

  ~$ pip3 install pycurl --compile --no-cache-dir
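Before starting pyspider again, it is worth checking that the rebuilt pycurl imports cleanly and which SSL library it reports. A minimal sketch: ssl_backend and installed_backend are helper names I made up, the list of library names is my own short list rather than anything exhaustive, and the sample string in the comment only illustrates the general shape of pycurl.version.

```python
# SSL library names to look for; an illustrative list, not an exhaustive one.
KNOWN_SSL_LIBRARIES = ("openssl", "libressl", "gnutls", "nss")


def ssl_backend(version_string):
    """Pick the SSL library token out of a pycurl.version-style string.

    Such a string is a space-separated list of name/version tokens, e.g.
    "PycURL/7.43.0.5 libcurl/7.64.1 OpenSSL/1.1.1d zlib/1.2.11" (illustrative).
    """
    for token in version_string.split():
        name = token.split("/", 1)[0]
        if name.lower() in KNOWN_SSL_LIBRARIES:
            return token
    return None


def installed_backend():
    """Report the SSL backend of the installed pycurl, or why the import failed."""
    try:
        import pycurl
    except ImportError as exc:
        # The backend mismatch in error 2 surfaces here, as an ImportError.
        return "import failed: %s" % exc
    return ssl_backend(pycurl.version)


if __name__ == "__main__":
    print(installed_backend())
```

If the import still fails with the same message, the exported LDFLAGS, CPPFLAGS and PYCURL_SSL_LIBRARY were probably not set in the shell that ran pip3.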

 

Run ~$ pyspider once more

There may be yet another error (error 3)

$ pyspider
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
[I 200209 17:07:58 result_worker:49] result_worker starting...
[I 200209 17:07:58 processor:211] processor starting...
[I 200209 17:07:58 scheduler:647] scheduler starting...
[I 200209 17:07:58 tornado_fetcher:638] fetcher starting...
[I 200209 17:07:58 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
[I 200209 17:07:58 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
[I 200209 17:07:58 app:84] webui exiting...
Traceback (most recent call last):
  File "/usr/local/bin/pyspider", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/pyspider/run.py", line 754, in main
    cli()
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1114, in invoke
    return Command.invoke(self, ctx)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/pyspider/run.py", line 165, in cli
    ctx.invoke(all)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/pyspider/run.py", line 497, in all
    ctx.invoke(webui, **webui_config)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/pyspider/run.py", line 384, in webui
    app.run(host=host, port=port)
  File "/usr/local/lib/python3.7/site-packages/pyspider/webui/app.py", line 59, in run
    from .webdav import dav_app
  File "/usr/local/lib/python3.7/site-packages/pyspider/webui/webdav.py", line 216, in <module>
    dav_app = WsgiDAVApp(config)
  File "/usr/local/lib/python3.7/site-packages/wsgidav/wsgidav_app.py", line 134, in __init__
    _check_config(config)
  File "/usr/local/lib/python3.7/site-packages/wsgidav/wsgidav_app.py", line 118, in _check_config
    raise ValueError("Invalid configuration:\n  - " + "\n  - ".join(errors))
ValueError: Invalid configuration:
  - Deprecated option 'domaincontroller': use 'http_authenticator.domain_controller' instead.

 


To get past this error, edit /usr/local/lib/python3.7/site-packages/pyspider/webui/webdav.py and replace 'domaincontroller': NeedAuthController(app) with 'http_authenticator':{
        'HTTP_Authenticator': NeedAuthController(app),
    },
so that the config.update call reads:

config.update({
    'mount_path': '/dav',
    'provider_mapping': {
        '/': ScriptProvider(app)
    },
    #'domaincontroller': NeedAuthController(app),  # comment this line out and use the block below instead
    'http_authenticator':{
        'HTTP_Authenticator': NeedAuthController(app),
    },

    'verbose': 1 if app.debug else 0,
    'dir_browser': {'davmount': False,
                    'enable': True,
                    'msmount': False,
                    'response_trailer': ''},
})
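The error message describes the new location as a dotted path, 'http_authenticator.domain_controller', meaning a domain_controller key nested inside an http_authenticator dict. The sketch below shows that kind of move on a plain dictionary. It is an illustration only: move_deprecated_key is a hypothetical helper, the string "controller" stands in for NeedAuthController(app), and it follows the wording of the error message rather than the exact key used in the snippet above. Whether WsgiDAV accepts pyspider's controller in that slot depends on the installed WsgiDAV version and is not verified here.

```python
def move_deprecated_key(config, old_key, dotted_path):
    """Return a copy of `config` with config[old_key] moved to the nested `dotted_path`."""
    updated = dict(config)  # shallow copy, so the caller's dict is left untouched
    if old_key not in updated:
        return updated
    value = updated.pop(old_key)
    *parents, leaf = dotted_path.split(".")
    node = updated
    for name in parents:
        # Copy (or create) each intermediate dict on the way down.
        child = dict(node.get(name) or {})
        node[name] = child
        node = child
    node[leaf] = value
    return updated


if __name__ == "__main__":
    old_style = {"mount_path": "/dav", "domaincontroller": "controller"}
    print(move_deprecated_key(
        old_style, "domaincontroller", "http_authenticator.domain_controller"))
```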

Start it again, and one more error may show up (error 4): Error: Could not create web server listening on port 25555

~$ pyspider
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
[I 200209 17:13:34 result_worker:49] result_worker starting...
[I 200209 17:13:34 processor:211] processor starting...
[I 200209 17:13:34 tornado_fetcher:638] fetcher starting...
[I 200209 17:13:34 scheduler:647] scheduler starting...
[I 200209 17:13:34 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
[I 200209 17:13:34 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
[I 200209 17:13:34 app:76] webui running on 0.0.0.0:5000
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
Error: Could not create web server listening on port 25555
^C[I 200209 17:13:43 tornado_fetcher:671] fetcher exiting...
[I 200209 17:13:43 scheduler:663] scheduler exiting...
[I 200209 17:13:43 result_worker:66] result_worker exiting...
[I 200209 17:13:43 processor:229] processor exiting...
[I 200209 17:13:43 app:84] webui exiting...

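This error means port 25555 is already in use, here by a leftover phantomjs process as lsof shows. Just kill that process: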

~$ lsof -i:25555
COMMAND     PID     USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
phantomjs 31631 XXXXX   11u  IPv4 0xdd790f1e6d58d86b      0t0  TCP *:25555 (LISTEN)
~$ kill 31631
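To confirm the port is actually free before restarting, a quick probe from Python works too. A minimal sketch using only the standard library: port_in_use is a helper name of my own, and it only tells you whether something accepts TCP connections on the port, not which process owns it (lsof is still the tool for that).

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 25555 is the phantomjs fetcher port from the log above.
    print(port_in_use(25555))
```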

Started successfully:

$ pyspider
phantomjs fetcher running on port 25555
[I 200209 17:14:20 result_worker:49] result_worker starting...
[I 200209 17:14:20 processor:211] processor starting...
[I 200209 17:14:20 tornado_fetcher:638] fetcher starting...
[I 200209 17:14:20 scheduler:647] scheduler starting...
[I 200209 17:14:20 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333
[I 200209 17:14:20 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0
[I 200209 17:14:20 app:76] webui running on 0.0.0.0:5000
[I 200209 17:15:20 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0

 
