I'm working on an NLP project and got bitten by some unreasonably slow behaviour in Python while operating on small amounts of numbers.
I have the following code:
```python
import random, time
from functools import reduce
def trainPerceptron(perceptron, data):
  learningRate = 0.002
  weights = perceptron['weights']
  error = 0
  for chunk in data:
    input = chunk['input']
    output = chunk['output']
    # 12x slower than equivalent JS
    sum_ = 0
    for key in input:
      v = weights[key]
      sum_ += v
    # 20x slower than equivalent JS
    #sum_ = reduce(lambda acc, key: acc + weights[key], input)
    actualOutput = sum_ if sum_ > 0 else 0
    expectedOutput = 1 if output == perceptron['id'] else 0
    currentError = expectedOutput - actualOutput
    if currentError:
      error += currentError ** 2
      change = currentError * learningRate
      for key in input:
        weights[key] += change
```
On 2023-03-14 16:48:24 +0900, Alexander Nestorov wrote:
I'm working on an NLP and I got bitten by an unreasonably slow
behaviour in Python while operating with small amounts of numbers.
I have the following code:[...]
# 12x slower than equivalent JS
sum_ = 0
for key in input:
    v = weights[key]
    sum_ += v
# 20x slower than equivalent JS
#sum_ = reduce(lambda acc, key: acc + weights[key], input)
Not surprising. Modern JavaScript implementations have a JIT compiler. CPython doesn't.
You may want to try PyPy if your code uses tight loops like that.
Alternatively, it may be possible to use numpy for these operations.
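As a rough sketch of the numpy route (assuming, as the code suggests, that 'weights' holds floats and 'input' is a list of integer indices; the values below are made up for illustration):

```python
import numpy as np

# Hypothetical stand-ins for the thread's 'weights' and 'input'.
weights = np.array([0.1, 0.2, 0.3, 0.4])
input_keys = np.array([0, 2, 3])

# Fancy indexing gathers every looked-up weight in one C-level call,
# and .sum() does the reduction without running Python bytecode per item.
sum_ = weights[input_keys].sum()
```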
On Thu, 16 Mar 2023 at 01:26, David Raymond <David.Raymond@tomtom.com> wrote:
I'm not quite sure why the built-in sum functions are slower than the for loop,
or why they're slower with the generator expression than with the list comprehension.
For small-to-medium data sizes, genexps are slower than list comps,
but use less memory. (At some point, using less memory translates
directly into faster runtime.) But even the sum-with-genexp version is notably faster than reduce.
Is 'weights' a dictionary? You're iterating over it, then subscripting
every time. If it is, try simply taking the sum of weights.values(),
as this should be significantly faster.
Or use the sum() builtin rather than reduce(), which was
*deliberately* removed from the builtins. The fact that you can get
sum() without importing, but have to go and reach for functools to get
reduce(), is a hint that you probably shouldn't use reduce when sum
will work.
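A minimal illustration of that substitution, with made-up data. Note the explicit 0 initializer here: the commented-out reduce() call in the original post omitted it, so its first accumulator value would have been a key rather than a weight.

```python
from functools import reduce

weights = [0.5, 1.5, 2.0, 3.0]  # hypothetical values
input_keys = [1, 3]             # hypothetical index list

via_reduce = reduce(lambda acc, key: acc + weights[key], input_keys, 0)
via_sum = sum(weights[key] for key in input_keys)

# Both compute the same total; sum() does it without a
# Python-level lambda call per element.
```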
Out of curiosity I tried a couple variations and am a little confused by the results. Maybe I'm having a brain fart and am missing something obvious?
Each of these was run with the same "data" and "perceptrons" values to keep that fair.
Times are averages over 150 iterations like the original.
The only thing changed in the trainPerceptron function was how to calculate sum_
Original:
sum_ = 0
for key in input:
    v = weights[key]
    sum_ += v
418ms
The reduce version:
sum_ = reduce(lambda acc, key: acc + weights[key], input)
758ms
Getting rid of the assignment to v in the original version:
sum_ = 0
for key in input:
    sum_ += weights[key]
380ms
But then using sum seems to be slower
sum with generator expression:
sum_ = sum(weights[key] for key in input)
638ms
sum with list comprehension:
sum_ = sum([weights[key] for key in input])
496ms
math.fsum with generator expression:
sum_ = math.fsum(weights[key] for key in input)
618ms
math.fsum with list comprehension:
sum_ = math.fsum([weights[key] for key in input])
480ms
I'm not quite sure why the built-in sum functions are slower than the for loop,
or why they're slower with the generator expression than with the list comprehension.
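For anyone wanting to reproduce comparisons like these, here is a self-contained timeit harness with synthetic data (the thread's real 'data' and 'perceptrons' aren't shown, so the sizes below are guesses):

```python
import timeit
from functools import reduce

weights = [i * 0.001 for i in range(1000)]   # hypothetical weight list
input_keys = list(range(0, 1000, 7))         # hypothetical sparse indices

def loop():
    sum_ = 0
    for key in input_keys:
        sum_ += weights[key]
    return sum_

def with_sum():
    return sum(weights[key] for key in input_keys)

def with_reduce():
    return reduce(lambda acc, key: acc + weights[key], input_keys, 0)

# All three must agree before the timings mean anything.
assert loop() == with_sum() == with_reduce()

for fn in (loop, with_sum, with_reduce):
    print(fn.__name__, timeit.timeit(fn, number=2000))
```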
On 3/15/2023 11:01 AM, Chris Angelico wrote:
On Thu, 16 Mar 2023 at 01:26, David Raymond <David.Raymond@tomtom.com> wrote:
I'm not quite sure why the built-in sum functions are slower than the for loop,
or why they're slower with the generator expression than with the list comprehension.
For small-to-medium data sizes, genexps are slower than list comps,
but use less memory. (At some point, using less memory translates
directly into faster runtime.) But even the sum-with-genexp version is notably faster than reduce.
Is 'weights' a dictionary? You're iterating over it, then subscripting every time. If it is, try simply taking the sum of weights.values(),
as this should be significantly faster.
It's a list.
Then I'm very confused as to how things are being done, so I will shut
up. There's not enough information here to give performance advice
without actually being a subject-matter expert already.
I have the following code: [...]

Nothing to do with your actual question and it's probably just a small oversight, but still I thought it was worth a mention: that comment doesn't match the code:

...
for i in range(151): # 150 iterations
  ...

range(151) actually performs 151 iterations, not 150.
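The off-by-one is easy to verify, since range(n) yields exactly n values:

```python
# range(n) runs from 0 to n-1 inclusive, i.e. n iterations.
assert len(list(range(151))) == 151  # one more than the comment claims
assert len(list(range(150))) == 150
```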
Sum is faster than iteration in the general case.
def sum1():
    s = 0
    for i in range(1000000):
        s += i
    return s

def sum2():
    return sum(range(1000000))
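One way to check that claim on your own machine (a sketch; the exact ratio varies with Python version and hardware):

```python
import timeit

def sum1():
    s = 0
    for i in range(1000000):
        s += i
    return s

def sum2():
    return sum(range(1000000))

# Both compute the same triangular number...
assert sum1() == sum2() == 499999500000

# ...but sum2 keeps the loop in C, so it is typically several times faster.
print("sum1:", timeit.timeit(sum1, number=5))
print("sum2:", timeit.timeit(sum2, number=5))
```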
def sum1():
    s = 0
    for i in range(1000000):
        s += i
    return s

def sum2():
    return sum(range(1000000))

Here you already have the numbers you want to add.
Actually using numpy you'll be much faster in this case:

import numpy as np

def sum3():
    return np.arange(1_000_000, dtype=np.int64).sum()

On my computer sum1 takes 44 ms, while the numpy version takes just 2.6 ms.
One problem is that sum2 gives the wrong result. This is why I used np.arange with dtype=np.int64.
sum2 evidently doesn't use Python's "big integers" and restricts the result to 32 bits.
One problem is that sum2 gives the wrong result.

>>> sum(range(101))
5050
>>> numpy.arange(101, dtype=numpy.int64).sum()
5050
On 3/20/2023 11:21 AM, Edmondo Giovannozzi wrote:
def sum1():
    s = 0
    for i in range(1000000):
        s += i
    return s

def sum2():
    return sum(range(1000000))

Here you already have the numbers you want to add.
Actually using numpy you'll be much faster in this case:

import numpy as np

def sum3():
    return np.arange(1_000_000, dtype=np.int64).sum()
On my computer sum1 takes 44 ms, while the numpy version just 2.6 ms

On my computer they all give the same result.
One problem is that sum2 gives the wrong result. This is why I used np.arange with dtype=np.int64.
Python 3.10.9, PyQt version 6.4.1
Windows 10 AMD64 (build 10.0.19044) SP0
Processor: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz, 1690 Mhz, 4 Core(s), 8 Logical Processor(s)
sum2 evidently doesn't use Python's "big integers" and restricts the result to 32 bits.

What about your system? Let's see if we can figure out the reason for the difference.