-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance investigation #37
Comments
Ah, okay, so i think this is a clue... For some reason, even simple assignment is calling julia> x = FixedPointDecimals.FixedDecimal{Int128,2}(3)
FixedDecimal{Int128,2}(3.00)
julia> @code_warntype Ref(x)[] = FixedPointDecimals.FixedDecimal{Int128,2}(5)
Body::Base.RefValue{FixedDecimal{Int128,2}}
33 1 ─ goto #3 if not false │
2 ─ nothing │
3 ─ %3 = invoke Base.convert(FixedDecimal{Int128,2}::Type{FixedDecimal{Int128,2}}, _3::FixedDecimal{Int128,2})::FixedDecimal{Int128,2} │╻ setproperty!
│ (Base.setfield!)(b, :x, %3) ││
└── return b │ And then, julia> @code_warntype Base.convert(FixedPointDecimals.FixedDecimal{Int128,2}, FixedPointDecimals.FixedDecimal{Int128,2}(3))
Body::FixedDecimal{Int128,2}
282 1 ─ (Base.checked_sdiv_int)(100, 100) │╻ div
283 │ %2 = (Base.getfield)(x, :i)::Int128 │╻ getproperty
│ %3 = invoke BigInt(%2::Int128)::BigInt ││╻╷ widen
│ (Base.slt_int)(1, 0) │││╻╷╷╷ convert
│ (Base.slt_int)(1, 0) ││││╻╷╷ Type
│ (Base.ule_int)(0x00000000000000000000000000000001, 0x0000000000000000ffffffffffffffff) │││││╻ <=
│ %7 = Base.GMP.MPZ.set_ui::typeof(Base.GMP.MPZ.set_ui) │││││╻ Type
│ %8 = invoke %7(0x0000000000000001::UInt64)::BigInt ││││││
│ %9 = Base.GMP.MPZ.mul::typeof(Base.GMP.MPZ.mul) ││╻ *
│ %10 = invoke %9(%3::BigInt, %8::BigInt)::BigInt │││
│ %11 = invoke $(Expr(:static_parameter, 1))(%10::BigInt)::Int128 │
│ %12 = %new(FixedDecimal{Int128,2}, %11)::FixedDecimal{Int128,2} │╻ reinterpret
└── return %12 │
2 ─ $(Expr(:unreachable)) │ The problem is it's calling this convert here:
julia> @which Base.convert(FixedPointDecimals.FixedDecimal{Int128,2}, FixedPointDecimals.FixedDecimal{Int128,2}(3))
convert(::Type{FixedDecimal{T,f}}, x::FixedDecimal{U,g}) where {T, f, U, g} in FixedPointDecimals at /Users/nathan.daly/.julia/dev/FixedPointDecimals/src/FixedPointDecimals.jl:289 So I think the solution is add an overload for where the types are the same. I'll try that and post back. |
Ah, okay this issue exactly describes the problem. We're missing a |
The only |
Ah, actually, no i think it was coming from this widemul: FixedPointDecimals.jl/src/FixedPointDecimals.jl Lines 288 to 292 in d87debb
Because In this case, where the types are identical, no conversion was necessary. It may also be worth it to investigate if we can make other simplifications when |
Hi again! So we've been poking at this for the last couple weeks, and we've got some significant performance improvements! :) First, though, we've improved the benchmarks we're using to judge these changes. I think we've managed to cut out any of the extra costs that were in the old benchmark (like iterating through an array), so that we can show just the cost of each operation. Here's the new file: And here are the results for
With the following caveat: What would you all think about merging some version of this benchmark file into this repo? It would be nice to have it running automatically and, ideally, detecting regressions between commits. With something like this, where performance is a feature, it's nice to be able to make informed statements about it! |
Splitting out the benchmark discussion from #36.
Here's the benchmark in a gist:
https://gist.github.com/NHDaly/a8fae0d1d65ab1066c585c27e54146fa
And the results in a google spreadsheet:
https://docs.google.com/spreadsheets/d/1Lc3ughgwwK25cpwbnuLRxw19EtaMxdCsMmtnspT_4H8/edit?usp=sharing
Here are the results:
Here are my current question:
FixedDecimal{Int128, 2}
into another array, allocates like crazy. (14,000,000 allocations / 1,000,000 elements)./
and*
ofFixedDecimal{Int64, 2}
are more expensive than forFixedDecimal{Int32, 2}
., by a factor of around 1.5x each, whereas/
and*
forInt64
andInt32
are almost identical.Int128
during those operations (due towidemul
), which seems to be slower thanInt64
across the board./
forInt128
is like 6x slower than forInt32
! Where does that come from?The text was updated successfully, but these errors were encountered: