Split by comma and strip whitespace in Python

Translated from: Split by comma and strip whitespace in Python

I have some Python code that splits on commas but doesn't strip the whitespace:

>>> string = "blah, lots  ,  of ,  spaces, here "
>>> mylist = string.split(',')
>>> print(mylist)
['blah', ' lots  ', '  of ', '  spaces', ' here ']

I would rather end up with the whitespace removed, like this:

['blah', 'lots', 'of', 'spaces', 'here']

I am aware that I could loop through the list and strip() each item but, as this is Python, I'm guessing there's a quicker, easier and more elegant way of doing it.


Reply #1

Reference: https://stackoom.com/question/H59g/用逗號分割並在Python中去除空格


Reply #2

Split using a regular expression. Note that I made the case more general by adding leading spaces. The list comprehension removes the empty strings at the front and back.

>>> import re
>>> string = "  blah, lots  ,  of ,  spaces, here "
>>> pattern = re.compile(r"^\s+|\s*,\s*|\s+$")
>>> print([x for x in pattern.split(string) if x])
['blah', 'lots', 'of', 'spaces', 'here']

This works even if ^\s+ doesn't match:

>>> string = "foo,   bar  "
>>> print([x for x in pattern.split(string) if x])
['foo', 'bar']
>>>

Here's why you need ^\s+:

>>> string = "  blah, lots  ,  of ,  spaces, here "
>>> pattern = re.compile(r"\s*,\s*|\s+$")
>>> print([x for x in pattern.split(string) if x])
['  blah', 'lots', 'of', 'spaces', 'here']

See the leading spaces in '  blah'?

Clarification: the above uses the Python 3 interpreter, but the results are the same in Python 2.
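
A somewhat simpler variant, sketched here on the assumption that stripping the whole string first is acceptable: call strip() once, then split on `\s*,\s*`, so the pattern needs no anchors. The helper name `split_and_strip` is hypothetical and not part of the answer above. Unlike the list comprehension above, this sketch keeps empty items, such as the one between two adjacent commas.

```python
import re

# Hypothetical helper name, used for illustration only.
def split_and_strip(text):
    """Split text on commas, discarding whitespace around each item."""
    # Strip the outer whitespace first so the pattern needs no ^ or $ anchors.
    return re.split(r"\s*,\s*", text.strip())

print(split_and_strip("  blah, lots  ,  of ,  spaces, here "))
# ['blah', 'lots', 'of', 'spaces', 'here']
```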


Reply #3

I came to add:

map(str.strip, string.split(','))

but saw it had already been mentioned by Jason Orendorff in a comment.

Reading Glenn Maynard's comment on the same answer, which suggests list comprehensions over map, I started to wonder why. I assumed he meant for performance reasons, but of course he might have meant for stylistic reasons, or something else (Glenn?).

So a quick (possibly flawed?) test on my box, applying the three methods in a loop, revealed:

[word.strip() for word in string.split(',')]
$ time ./list_comprehension.py 
real    0m22.876s

map(lambda s: s.strip(), string.split(','))
$ time ./map_with_lambda.py 
real    0m25.736s

map(str.strip, string.split(','))
$ time ./map_with_str.strip.py 
real    0m19.428s

making map(str.strip, string.split(',')) the winner, although it seems they are all in the same ballpark.

Certainly, though, map (with or without a lambda) should not necessarily be ruled out for performance reasons, and for me it is at least as clear as a list comprehension.

Edit:

Python 2.6.5 on Ubuntu 10.04
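
The timings above come from running whole scripts under `time`, which also counts interpreter startup. A rough way to repeat the comparison in-process is the standard-library timeit module. This is only a sketch, not the original test scripts (which are not shown); the repeat count is arbitrary, and results will vary by machine and Python version. Note that in Python 3 map returns a lazy iterator, so list() is wrapped around it to make all three candidates do the same work.

```python
import timeit

string = "blah, lots  ,  of ,  spaces, here "

# The three candidates from the test above; list() forces the lazy map
# object that Python 3 returns, so each candidate builds a full list.
candidates = {
    "list comprehension": lambda: [word.strip() for word in string.split(',')],
    "map with lambda": lambda: list(map(lambda s: s.strip(), string.split(','))),
    "map with str.strip": lambda: list(map(str.strip, string.split(','))),
}

for name, func in candidates.items():
    # number=100000 is an arbitrary repeat count chosen for this sketch.
    seconds = timeit.timeit(func, number=100000)
    print("%-20s %.3f s" % (name, seconds))
```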


Reply #4

s = 'bla, buu, jii'

# Split on commas, then strip the whitespace around each piece.
sp = s.split(',')
for st in sp:
    print(st.strip())


Reply #5

re (as in regular expressions) allows splitting on multiple characters at once:

>>> import re
>>> string = "blah, lots  ,  of ,  spaces, here "
>>> re.split(', ', string)
['blah', 'lots  ', ' of ', ' spaces', 'here ']

This doesn't work well for your example string, but works nicely for a comma-space separated list. For your example string, you can use re.split with a character class to get a "split-on-this-or-that" effect.

>>> re.split('[, ]', string)
['blah',
 '',
 'lots',
 '',
 '',
 '',
 '',
 'of',
 '',
 '',
 '',
 'spaces',
 '',
 'here',
 '']

Unfortunately, that's ugly, but a filter will do the trick:

>>> list(filter(None, re.split('[, ]', string)))
['blah', 'lots', 'of', 'spaces', 'here']

Voilà!
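
One caveat, illustrated here with a made-up input: because the character class `[, ]` also splits on every space, items that contain internal spaces get broken apart. Splitting on the comma alone and stripping each piece keeps such items whole. The sample string is an assumption for illustration and is not from the question.

```python
import re

# Made-up input in which the items contain internal spaces.
string = "New York, Los Angeles ,Boston"

# Splitting on commas *or* spaces breaks multi-word items apart.
by_class = list(filter(None, re.split('[, ]', string)))
print(by_class)   # ['New', 'York', 'Los', 'Angeles', 'Boston']

# Splitting on commas only, then stripping, keeps them intact.
by_comma = [item.strip() for item in string.split(',')]
print(by_comma)   # ['New York', 'Los Angeles', 'Boston']
```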


Reply #6

import re
result = [x for x in re.split(',| ', your_string) if x != '']

This works fine for me.
