Fixing StanfordCoreNLP's "cannot find java" error when running under PyCharm remote debugging

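I have been learning NLP recently and had set up remote debugging and execution in PyCharm. When I used stanfordcorenlp, it failed with FileNotFoundError: [Errno 2] No such file or directory: 'java': 'java'. I assumed that, as in my previous post "Fixing pyhanlp's FileNotFoundError: [Errno 2] No such file or directory: '/usr/lib/jvm'", adding an environment variable would be enough, but that did not help. I could not find a similar error reported online either, so perhaps my JDK installation is an unusual one.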

Error details:

ssh://yl@IP:PORT/home/USER/anaconda3/envs/tensorflow/bin/python -u /home/yl/python/nlp/learing/test01.py
Traceback (most recent call last):
  File "/home/yl/python/nlp/learing/test01.py", line 7, in <module>
    snlp = StanfordCoreNLP(os.sep + 'opt' + os.sep + "nlp" + os.sep + 'stanford-corenlp', lang='zh')
  File "/home/yl/anaconda3/envs/tensorflow/lib/python3.7/site-packages/stanfordcorenlp/corenlp.py", line 46, in __init__
    if not subprocess.call(['java', '-version'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) == 0:
  File "/home/yl/anaconda3/envs/tensorflow/lib/python3.7/subprocess.py", line 323, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/home/yl/anaconda3/envs/tensorflow/lib/python3.7/subprocess.py", line 774, in __init__
    restore_signals, start_new_session)
  File "/home/yl/anaconda3/envs/tensorflow/lib/python3.7/subprocess.py", line 1522, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'java': 'java'
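
The traceback shows that the failure comes from the plain subprocess.call(['java', '-version'], ...) probe in corenlp.py, not from CoreNLP itself. As a minimal sketch, the same probe can be run in isolation from PyCharm to confirm this. The helper name can_launch is made up for this illustration, and it uses subprocess.DEVNULL where the library uses subprocess.PIPE:

```python
import subprocess


def can_launch(cmd, *args):
    """Return True if `cmd` can be started at all, False if it is not found.

    Hypothetical helper for diagnosis; it only checks that the executable
    can be located and started, not that it exits successfully.
    """
    try:
        subprocess.call([cmd, *args],
                        stdout=subprocess.DEVNULL,
                        stderr=subprocess.STDOUT)
    except FileNotFoundError:
        # Same exception type as in the traceback above.
        return False
    return True


if __name__ == '__main__':
    print('java can be launched:', can_launch('java', '-version'))
```

If this prints False under the PyCharm remote interpreter but True in a terminal on the same machine, the difference lies in the environment of the two processes.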

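Root cause analysis:

The usual approach: with no answer to be found online, I read the source code to find the cause. Open /home/yl/anaconda3/envs/tensorflow/lib/python3.7/subprocess.py and find the function _execute_child(); near line 1522 is the following code: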

                if issubclass(child_exception_type, OSError) and hex_errno:
                    errno_num = int(hex_errno, 16)
                    child_exec_never_called = (err_msg == "noexec")
                    if child_exec_never_called:
                        err_msg = ""
                        # The error must be from chdir(cwd).
                        err_filename = cwd
                    else:
                        err_filename = orig_executable
                    if errno_num != 0:
                        err_msg = os.strerror(errno_num)
                        if errno_num == errno.ENOENT:
                            err_msg += ': ' + repr(err_filename)
                    raise child_exception_type(errno_num, err_msg, err_filename)
                

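The error is reported when errno_num is nonzero. Tracing upward, its source is errno_num -> hex_errno -> errpipe_data -> errpipe_read, which is filled in when self.pid = _posixsubprocess.fork_exec() runs (around line 1452). Judging from that function's parameter names, executable_list and env_list should determine whether java can be found. So I added print calls around the place where executable_list is built to see how it changes, as follows (lines 1436 to 1442):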

                    executable = os.fsencode(executable)
                    print('executable: ', executable)    # print the initial value passed in from the caller
                    if os.path.dirname(executable):
                        executable_list = (executable,)
                    else:
                        # This matches the behavior of os._execvpe().
                        print('env: ', env)    # print env
                        print('get_exec_path of env: ', os.get_exec_path(env))  # print the executable search path derived from env (the PATH variable)
                        executable_list = tuple(
                            os.path.join(os.fsencode(dir), executable)
                            for dir in os.get_exec_path(env))
                        print('executable_list: ', executable_list)    # print the final list of candidate paths

Output when importing stanfordcorenlp and calling StanfordCoreNLP from PyCharm:

executable:  b'java'
env:  None
get_exec_path of env:  ['/usr/local/sbin', '/usr/local/bin', '/usr/sbin', '/usr/bin', '/sbin', '/bin', '/usr/games', '/usr/local/games']
executable_list:  (b'/usr/local/sbin/java', b'/usr/local/bin/java', b'/usr/sbin/java', b'/usr/bin/java', b'/sbin/java', b'/bin/java', b'/usr/games/java', b'/usr/local/games/java')
Traceback (most recent call last):

Output when run in a Linux terminal:

yl@ylhome [20:24:07] ~$ /home/yl/anaconda3/envs/tensorflow/bin/python
Python 3.7.3 (default, Mar 27 2019, 22:11:17) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> from stanfordcorenlp import StanfordCoreNLP 
>>> stanford_nlp = StanfordCoreNLP(os.sep + '/opt' + os.sep + "/nlp" + os.sep + '/stanford-corenlp', lang='zh')
executable:  b'java'
env:  None
get_exec_path of env:  ['/home/yl/.local/bin', '/home/yl/bin', '/usr/local/cuda/bin', '/home/yl/anaconda3/bin', '/usr/local/java/latest/bin', '/usr/local/cuda/bin', '/usr/local/sbin', '/usr/local/bin', '/usr/sbin', '/usr/bin', '/sbin', '/bin', '/usr/games', '/usr/local/games', '/snap/bin']
executable_list:  (b'/home/yl/.local/bin/java', b'/home/yl/bin/java', b'/usr/local/cuda/bin/java', b'/home/yl/anaconda3/bin/java', b'/usr/local/java/latest/bin/java', b'/usr/local/cuda/bin/java', b'/usr/local/sbin/java', b'/usr/local/bin/java', b'/usr/sbin/java', b'/usr/bin/java', b'/sbin/java', b'/bin/java', b'/usr/games/java', b'/usr/local/games/java', b'/snap/bin/java')
executable:  b'/bin/sh'
>>> 

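The PATH read in the terminal is correct, whereas under PyCharm's remote invocation the PATH entries added by the user (here /usr/local/java/latest/bin) are not picked up. A convenient fix is therefore to link java into a location that PyCharm's invocation can see, such as /usr/local/bin.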
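
The same comparison can be made without editing subprocess.py. The sketch below mirrors the candidate-list logic traced above using only os.get_exec_path() and shutil.which(); the function name candidate_paths is hypothetical and exists only for this illustration:

```python
import os
import shutil


def candidate_paths(executable, env=None):
    """Return the paths that would be tried for `executable`.

    Mirrors the executable_list logic in subprocess._execute_child():
    a name with a directory part is used as-is, a bare name is joined
    with every directory returned by os.get_exec_path(env).
    """
    if os.path.dirname(executable):
        return [executable]
    # env=None means "use os.environ", matching the "env:  None" output above.
    return [os.path.join(directory, executable)
            for directory in os.get_exec_path(env)]


if __name__ == '__main__':
    for path in candidate_paths('java'):
        # Mark which candidates actually exist for this process.
        print(path, '(exists)' if os.path.exists(path) else '')
    print('shutil.which("java"):', shutil.which('java'))
```

Running this script once from PyCharm and once from a terminal should show the same difference in search directories as the two outputs above.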

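I was short on time, so I did not dig into the underlying reason; the immediate problem is solved for now, and I will come back to it later.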

Solution:

Link the java command into one of the system's default executable directories, such as /usr/bin or /usr/local/bin. My setup:

sudo ln -sf /usr/local/java/latest/bin/java /usr/local/bin/java
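
An alternative that needs no sudo, and that this post did not try, is to extend PATH inside the Python process before StanfordCoreNLP is created. Because the probe runs with env=None, it searches the directories in os.environ['PATH'], so a change made there should be picked up. One guess at the underlying cause, not verified here, is that the remote interpreter is started from a session that never reads the shell startup files where PATH was extended. In the sketch below, JAVA_BIN is taken from the terminal output above, and the helper prepend_to_path is hypothetical:

```python
import os

# Assumption: the JDK bin directory seen in the terminal's PATH above.
JAVA_BIN = '/usr/local/java/latest/bin'


def prepend_to_path(directory):
    """Put `directory` at the front of this process's PATH if it is missing."""
    current = os.environ.get('PATH', '')
    entries = current.split(os.pathsep) if current else []
    if directory not in entries:
        os.environ['PATH'] = os.pathsep.join([directory] + entries)


prepend_to_path(JAVA_BIN)
# Create StanfordCoreNLP(...) only after this point, so that its
# `java -version` probe sees the extended PATH.
```

This only affects the current Python process and its children, so it has to run in every script that needs java, whereas the symlink fixes the lookup for all of them.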

Running it:

After that, code that uses the stanfordcorenlp package runs normally in PyCharm:

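import os

from stanfordcorenlp import StanfordCoreNLP

# Builds '/opt/nlp/stanford-corenlp'
snlp = StanfordCoreNLP(os.path.join(os.sep, 'opt', 'nlp', 'stanford-corenlp'), lang='zh')

# Named `text` rather than `str` so the built-in type is not shadowed
text = '今天晚上吃火锅啊!'
print(snlp.ner(text))

# Shut down the background CoreNLP server when done
snlp.close()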

Result:
