Best Python questions on Stack Overflow in 2016


1. A Pythonic way to avoid "if x: return x" statements

Question:
I have a method that calls 4 other methods in sequence to check for specific conditions, and returns immediately (not checking the following ones) whenever one returns something Truthy.

def check_all_conditions():
    x = check_size()
    if x:
        return x
    x = check_color()
    if x:
        return x
    x = check_tone()
    if x:
        return x
    x = check_flavor()
    if x:
        return x
    return None

This seems like a lot of baggage code. Instead of each 2-line if statement, I’d rather do something like:

x and return x

But that is invalid Python. Am I missing a simple, elegant solution here? Incidentally, in this situation, those four check methods may be expensive, so I do not want to call them multiple times.
Answer:

  • Use a loop:
conditions = (check_size, check_color, check_tone, check_flavor)
for condition in conditions:
    result = condition()
    if result:
        return result

Alternatively, use a generator expression:

conditions = (check_size, check_color, check_tone, check_flavor)
checks = (condition() for condition in conditions)
return next((check for check in checks if check),None)
  • Chain the calls with or, which returns the first truthy value, or None if there is none:
def check_all_conditions():
    return check_size() or check_color() or check_tone() or check_flavor() or None
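
As a minimal, self-contained sketch of this short-circuit behavior, the stand-in check functions below (hypothetical, not taken from the original question) record each call, so you can see that evaluation stops at the first truthy result and the remaining expensive checks never run:

```python
calls = []  # records which checks actually ran

# Hypothetical stand-ins for the expensive checks in the question.
def check_size():
    calls.append("size")
    return None

def check_color():
    calls.append("color")
    return "bad color"

def check_tone():
    calls.append("tone")
    return "bad tone"

def check_flavor():
    calls.append("flavor")
    return None

def check_all_conditions():
    # `or` evaluates left to right and stops at the first truthy value.
    return check_size() or check_color() or check_tone() or check_flavor() or None

result = check_all_conditions()
print(result)  # bad color
print(calls)   # ['size', 'color']
```

Because check_tone and check_flavor are never called here, the chained form has the same cost profile as the original sequence of if statements.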


2. How can I make sense of the "else" clause of Python loops?

Question:
Many Python programmers are probably unaware that the syntax of while loops and for loops includes an optional else: clause:

for val in iterable:
    do_something(val)
else:
    clean_up()

The body of the else clause is a good place for certain kinds of clean-up actions, and is executed on normal termination of the loop: I.e., exiting the loop with return or break skips the else clause; exiting after a continue executes it. I know this only because I just looked it up (yet again), because I can never remember when the else clause is executed.

Always? On “failure” of the loop, as the name suggests? On regular termination? Even if the loop is exited with return? I can never be entirely sure without looking it up.

I blame my persisting uncertainty on the choice of keyword: I find else incredibly unmnemonic for this semantics. My question is not “why is this keyword used for this purpose” (which I would probably vote to close, though only after reading the answers and comments), but how can I think about the else keyword so that its semantics make sense, and I can therefore remember it?

I’m sure there was a fair amount of discussion about this, and I can imagine that the choice was made for consistency with the try statement’s else: clause (which I also have to look up), and with the goal of not adding to the list of Python’s reserved words. Perhaps the reasons for choosing else will clarify its function and make it more memorable, but I’m after connecting name to function, not after historical explanation per se.

The answers to this question, which my question was briefly closed as a duplicate of, contain a lot of interesting back story. My question has a different focus (how to connect the specific semantics of else with the keyword choice), but I feel there should be a link to this question somewhere.
Answer:
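An if statement runs its else clause when its condition evaluates to false. In the same way, a while loop runs its else clause when its condition evaluates to false.
This rule matches the behavior you described:

  • In normal execution, the while loop runs repeatedly until the condition evaluates to false, so exiting the loop naturally runs the else clause.
  • When a break statement executes, you leave the loop without evaluating the condition, so the condition cannot evaluate to false and the else clause never runs.
  • When a continue statement executes, the condition is evaluated again, and you do exactly what you normally would at the start of a loop iteration. So if the condition is true you keep looping, and if it is false you run the else clause.
  • Other methods of exiting the loop, such as return, do not evaluate the condition and therefore do not run the else clause.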

for loops behave the same way. Just consider the condition as true if the iterator has more elements, or false otherwise.
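
A short sketch of this rule applied to a for loop (the function name and data are illustrative assumptions):

```python
def find_first_negative(values):
    """Describe the first negative number in `values`, if there is one."""
    for v in values:
        if v < 0:
            found = f"found {v}"
            break  # leaves the loop without re-checking the condition, so else is skipped
    else:
        # Reached only when the iterator ran out of items,
        # i.e. the loop "condition" became false.
        found = "no negatives"
    return found

print(find_first_negative([3, -1, 4]))  # found -1
print(find_first_negative([3, 1, 4]))   # no negatives
```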

3. How do I avoid the "self.x = x; self.y = y; self.z = z" pattern in __init__?

def __init__(self, x, y, z):
    self.x = x
    self.y = y
    self.z = z

Is there a good way to avoid this parameter-initialization pattern? Should I inherit from namedtuple instead?
Answer:
Writing the assignments out explicitly is perfectly fine.
Still, here are some ways to avoid the pattern:

  • For keyword-only arguments, simply use setattr:
class A:
    def __init__(self, **kwargs):
        for key in kwargs:
            setattr(self, key, kwargs[key])

a = A(l=1, n=2, m=0)
a.l  # returns 1
a.n  # returns 2
a.m  # returns 0
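  • For a mix of positional and keyword arguments, use a decorator (this example relies on the third-party decorator package):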
import decorator
import inspect
import sys


@decorator.decorator
def simple_init(func, self, *args, **kws):
    """
    @simple_init
    def __init__(self,a,b,...,z)
        dosomething()

    behaves like

    def __init__(self,a,b,...,z)
        self.a = a
        self.b = b
        ...
        self.z = z
        dosomething()
    """

    #init_argumentnames_without_self = ['a','b',...,'z']
    if sys.version_info.major == 2:
        init_argumentnames_without_self = inspect.getargspec(func).args[1:]
    else:
        init_argumentnames_without_self = tuple(inspect.signature(func).parameters.keys())[1:]

    positional_values = args
    keyword_values_in_correct_order = tuple(kws[key] for key in init_argumentnames_without_self if key in kws)
    attribute_values = positional_values + keyword_values_in_correct_order

    for attribute_name,attribute_value in zip(init_argumentnames_without_self,attribute_values):
        setattr(self,attribute_name,attribute_value)

    # call the original __init__
    func(self, *args, **kws)


class Test():
    @simple_init
    def __init__(self,a,b,c,d=4):
        print(self.a,self.b,self.c,self.d)

# prints 1 3 2 4
t = Test(1,c=2,b=3)
# keeps signature
# prints ['self', 'a', 'b', 'c', 'd'] on Python 2 and (self, a, b, c, d=4) on Python 3
if sys.version_info.major == 2:
    print(inspect.getargspec(Test.__init__).args)
else:
    print(inspect.signature(Test.__init__))
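
If you would rather not depend on a third-party package, the same idea can be sketched with the standard library only. This is an illustrative variant for Python 3, not the code from the original answer; the names auto_assign and Point are assumptions made for the example:

```python
import functools
import inspect

def auto_assign(init):
    """Copy every __init__ argument onto self before running the real __init__."""
    sig = inspect.signature(init)

    @functools.wraps(init)  # lets inspect.signature() report the original signature
    def wrapper(self, *args, **kwargs):
        bound = sig.bind(self, *args, **kwargs)
        bound.apply_defaults()  # fill in parameters that were left at their defaults
        for name, value in list(bound.arguments.items())[1:]:  # skip self
            setattr(self, name, value)
        return init(self, *args, **kwargs)

    return wrapper

class Point:
    @auto_assign
    def __init__(self, x, y, z=0):
        pass  # attributes are already set when this body runs

p = Point(1, z=3, y=2)
print(p.x, p.y, p.z)  # 1 2 3
```

Binding through inspect.signature handles positional arguments, keyword arguments, and defaults uniformly, at the cost of some overhead on every instantiation.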

4. Why does the floating-point value of 4*0.1 look right in Python 3, while 3*0.1 does not?
In Python 3, 4*0.1 displays as 0.4, but 3*0.1 displays as 0.30000000000000004.

Answer:
The simple answer is that 3*0.1 != 0.3 because of quantization (roundoff) error, whereas 4*0.1 == 0.4 because multiplying by a power of two is usually an "exact" operation.

You can use the .hex method in Python to view the internal representation of a number (basically, the exact binary floating-point value rather than the base-10 approximation). This can help to explain what is going on under the hood.
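
For instance, the following sketch prints the hexadecimal form of the values discussed here (the choice of values and the formatting are my own):

```python
# float.hex() shows the exact stored binary value as a hexadecimal mantissa and exponent.
samples = [("0.1", 0.1), ("0.3", 0.3), ("0.1*3", 0.1 * 3),
           ("0.4", 0.4), ("0.1*4", 0.1 * 4)]
for label, value in samples:
    print(f"{label:>6}: {value.hex()}")

# Multiplying by 4 only changes the exponent, so the result matches the literal 0.4;
# multiplying by 3 gives a value that differs from the literal 0.3 in the last hex digit.
print(0.1 * 4 == 0.4)  # True
print(0.1 * 3 == 0.3)  # False
```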
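The value 0.1 is stored as 0x1.999999999999a times 2^-4. The trailing "a" means the final digits were rounded up, so the stored value is very slightly larger than the exact 0.1. Multiplying by 4, a power of two, only shifts the exponent (from 2^-4 to 2^-2) and leaves the digits unchanged, so 4*0.1 == 0.4.
However, when multiplying by 3, the tiny difference between 0x0.99 and 0x0.a0 (0x0.07) is magnified into an error of 0x0.15, which shows up as a one-digit error in the last position. This makes 0.1*3 very slightly larger than the rounded value of 0.3.
Python 3's float repr is designed to be round-trippable, that is, the value shown should convert back to exactly the original value. Therefore it cannot display 0.3 and 0.1*3 in exactly the same way, or the two different numbers would become the same after round-tripping. So Python 3's repr engine chooses to display one of them with a slight apparent error.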
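
The round-trip property can be checked directly. A small sketch:

```python
a = 0.3
b = 0.1 * 3

print(repr(a))  # 0.3
print(repr(b))  # 0.30000000000000004

# Round-tripping: parsing the repr gives back exactly the same float.
assert float(repr(a)) == a
assert float(repr(b)) == b

# The two floats differ, so their reprs must differ as well.
assert a != b and repr(a) != repr(b)
```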

5. Can Python code know its own indentation nesting level?

print(get_indentation_level())
    print(get_indentation_level())
        print(get_indentation_level())

I would like to get something like this:

1
2
3

Can the code read itself in this way?
All I want is the output from the more nested parts of the code to be more nested. In the same way that this makes code easier to read, it would make the output easier to read.

Of course I could implement this manually, using e.g. .format(), but what I had in mind was a custom print function which would print(i*' ' + string) where i is the indentation level. This would be a quick way to make readable output on my terminal.

Is there a better way to do this which avoids painstaking manual formatting?
Answer:
If you want indentation in terms of nesting level rather than spaces and tabs, things get tricky. For example, in the following code:

if True:
    print(
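get_nesting_level())

the call to get_nesting_level is actually nested one level deep, despite the fact that there is no leading whitespace on the line of the get_nesting_level call. Meanwhile, in the following code: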

print(1,
      2,
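      get_nesting_level())

the call to get_nesting_level is nested zero levels deep, despite the leading whitespace on its line.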
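
One way to measure nesting rather than whitespace is to count INDENT and DEDENT tokens with the standard tokenize module. The helper below is a simplified sketch under my own assumptions (the name nesting_level_at and its interface are hypothetical): it works on a source string and a line number, whereas a real get_nesting_level() would also have to locate the caller's source.

```python
import io
import tokenize

def nesting_level_at(source, lineno):
    """Return the block nesting depth in effect on line `lineno` (1-based) of `source`."""
    depth = 0
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.start[0] > lineno:
            break
        if tok.type == tokenize.INDENT:
            depth += 1
        elif tok.type == tokenize.DEDENT:
            depth -= 1
    return depth

# The two snippets discussed above, with the call on line 3 in both cases.
src_nested = "if True:\n    print(\nget_nesting_level())\n"
src_flat = "print(1,\n      2,\n      get_nesting_level())\n"

print(nesting_level_at(src_nested, 3))  # 1
print(nesting_level_at(src_flat, 3))    # 0
```

The tokenizer emits no INDENT or DEDENT tokens inside parentheses, which is why continuation lines do not affect the count.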

6. Why are Python's arrays slow?

I expected array.array to be faster than a list, since arrays appear to be unboxed.

However, I got the following results:

In [1]: import array

In [2]: L = list(range(100000000))

In [3]: A = array.array('l', range(100000000))

In [4]: %timeit sum(L)
1 loop, best of 3: 667 ms per loop

In [5]: %timeit sum(A)
1 loop, best of 3: 1.41 s per loop

In [6]: %timeit sum(L)
1 loop, best of 3: 627 ms per loop

In [7]: %timeit sum(A)
1 loop, best of 3: 1.39 s per loop

What could cause such a difference?
Answer:
The storage is "unboxed", but every time you access an element Python has to "box" it (embed it in a regular Python object) in order to do anything with it. For example, your sum(A) iterates over the array, and boxes each integer, one at a time, in a regular Python int object. That costs time. In your sum(L), all the boxing was done at the time the list was created.

So, in the end, an array is generally slower, but requires substantially less memory.

Here's the relevant code from a recent version of Python 3, but the same basic ideas apply to all CPython implementations since Python was first released.
Here’s the code to access a list item:

PyObject *
PyList_GetItem(PyObject *op, Py_ssize_t i)
{
    /* error checking omitted */
    return ((PyListObject *)op) -> ob_item[i];
}

There's very little to it: somelist[i] just returns the i'th object in the list (and all Python objects in CPython are pointers to a struct whose initial segment conforms to the layout of a struct PyObject).

And here’s the getitem implementation for an array with type code l:

static PyObject *
l_getitem(arrayobject *ap, Py_ssize_t i)
{
    return PyLong_FromLong(((long *)ap->ob_item)[i]);
}

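The raw memory is treated as a vector of platform-native C long integers; the i'th C long is read; and then PyLong_FromLong() is called to wrap ("box") the native C long in a Python long object (which, in Python 3, where the Python 2 distinction between int and long is gone, is actually shown as type int).

This boxing has to allocate new memory for a Python int object and write the native C long's bits into it. In the context of the original example, this object's lifetime is very brief (just long enough for sum() to add the contents into a running total), and then more time is required to deallocate the new int object.

This is where the speed difference comes from in the CPython implementation.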
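
The boxing can be observed from Python itself. The sketch below assumes CPython; the value 10**10 is chosen to lie outside CPython's cache of small integers, and the helper name is my own:

```python
import array

big = 10**10  # large enough to fall outside CPython's small-int cache
L = [big]
A = array.array('q', [big])  # 'q': signed 64-bit C integers

def same_object_each_time(seq):
    """Read element 0 twice and report whether both reads gave the very same object."""
    first = seq[0]
    second = seq[0]
    return first is second

list_reuses_object = same_object_each_time(L)   # the list hands back its stored object
array_reuses_object = same_object_each_time(A)  # the array boxes a new int per access

print(list_reuses_object, array_reuses_object)
```

On CPython the list returns the same int object on every read, while the array creates a fresh one each time, which is the per-element cost that sum(A) pays.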

7. Is multiplying by 2 faster than a bit shift?

I was looking at the source of sorted_containers and was surprised to see this line:

self._load, self._twice, self._half = load, load * 2, load >> 1

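Here load is an integer. Why use a bit shift in one place and multiplication in another? It seems reasonable that bit shifting may be faster than integer division by 2, but why not replace the multiplication with a shift as well? I benchmarked the following cases:

  1. (multiply, divide)
  2. (shift, shift)
  3. (multiply, shift)
  4. (shift, divide)

and found that #3 is consistently faster than the other alternatives: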
import pandas as pd
import timeit
import random

x = random.randint(10**3, 10**6)

def test_naive():
    a, b, c = x, 2 * x, x // 2

def test_shift():
    a, b, c = x, x << 1, x >> 1

def test_mixed():
    a, b, c = x, x * 2, x >> 1

def test_mixed_swaped():
    a, b, c = x, x << 1, x // 2

def observe(k):
    print(k)
    return {
        'naive': timeit.timeit(test_naive),
        'shift': timeit.timeit(test_shift),
        'mixed': timeit.timeit(test_mixed),
        'mixed_swapped': timeit.timeit(test_mixed_swaped),
    }

def get_observation():
    return pd.DataFrame([observe(k) for k in range(100)])

if __name__ == '__main__':
    get_observation()

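Question:
Is my test valid? If so, why is (multiply, shift) faster than (shift, shift)? I run Python 3.5 on Ubuntu 14.04.
The above is the original statement of the question. Dan Getz provides an excellent explanation in his answer. For completeness, the question also included sample illustrations for larger x, where the multiplication optimization does not apply.
Answer: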
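This seems to be because multiplication of small numbers is optimized in CPython 3.5 in a way that left shifts by small numbers are not. Positive left shifts always create a larger integer object to store the result as part of the calculation, while for multiplications of the sort used in the test, a special optimization avoids this and creates an integer object of the correct size. This can be seen in the source code of Python's integer implementation.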

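Because integers in Python are arbitrary-precision, they are stored as arrays of integer "digits", with a limit on the number of bits per digit. So in the general case, operations involving integers are not single operations, but instead need to handle multiple "digits". In pyport.h, this bit limit is defined as 30 bits on 64-bit platforms and 15 bits otherwise. (I will use 30 from here on to keep the explanation simple. Note, however, that if you were using Python compiled for 32-bit, your benchmark's result would depend on whether x is less than 32,768.) The integer multiplication implementation begins as follows: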

static PyObject *
long_mul(PyLongObject *a, PyLongObject *b)
{
    PyLongObject *z;

    CHECK_BINOP(a, b);

    /* fast path for single-digit multiplication */
    if (Py_ABS(Py_SIZE(a)) <= 1 && Py_ABS(Py_SIZE(b)) <= 1) {
        stwodigits v = (stwodigits)(MEDIUM_VALUE(a)) * MEDIUM_VALUE(b);
#ifdef HAVE_LONG_LONG
        return PyLong_FromLongLong((PY_LONG_LONG)v);
#else
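        /* if we don't have long long then we're almost certainly
           using 15-bit digits, so v will fit in a long.  In the
           unlikely event that we're using 30-bit digits on a platform
           without long long, a large v will just cause us to fall
           through to the general multiplication code below. */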
        if (v >= LONG_MIN && v <= LONG_MAX)
            return PyLong_FromLong((long)v);
#endif
    }

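So when multiplying two integers that each fit in a 30-bit digit, the CPython interpreter performs a direct multiplication instead of treating the integers as arrays. (MEDIUM_VALUE() called on a positive integer object simply gets its first 30-bit digit.) If the result fits in a single 30-bit digit, PyLong_FromLongLong() will notice this in a relatively small number of operations and create a single-digit integer object to store it.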

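In contrast, left shifts are not optimized this way, and every left shift treats the integer being shifted as an array. In particular, if you read the source code of long_lshift(), you will see that for a small but positive left shift a 2-digit integer object is always created, if only to have its length truncated to 1 later (the comments in /*** ***/ are annotations):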

static PyObject *
long_lshift(PyObject *v, PyObject *w)
{
    /*** ... ***/

    wordshift = shiftby / PyLong_SHIFT;   /*** zero for small w ***/
    remshift  = shiftby - wordshift * PyLong_SHIFT;   /*** w for small w ***/

    oldsize = Py_ABS(Py_SIZE(a));   /*** 1 for small v > 0 ***/
    newsize = oldsize + wordshift;
    if (remshift)
        ++newsize;   /*** here newsize becomes at least 2 for w > 0, v > 0 ***/
    z = _PyLong_New(newsize);

    /*** ... ***/
}

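Integer division
You did not ask about the worse performance of integer floor division compared to right shifts, because that matched your (and my) expectations. But dividing a small positive number by another small positive number is not as optimized as small multiplications either. Every // computes both the quotient and the remainder using the function long_divrem(). For a small divisor, this remainder is computed with a multiplication and stored in a newly allocated integer object, which in this situation is immediately discarded.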
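
As a rough check of the digit-based storage described above, CPython exposes the digit width through sys.int_info. The sketch below inspects it and shows where an integer starts to need a second digit; the variable names are my own, and the exact sizes reported by sys.getsizeof depend on the CPython build:

```python
import sys

bits = sys.int_info.bits_per_digit  # 30 on typical 64-bit builds, 15 otherwise
one_digit = (1 << bits) - 1         # largest value that fits in a single digit
two_digits = 1 << bits              # smallest value that needs a second digit

# Crossing the digit boundary makes the int object grow by one digit of storage.
extra = sys.getsizeof(two_digits) - sys.getsizeof(one_digit)

print("bits per digit:", bits)
print("extra bytes for the second digit:", extra)
```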
