Refer to indices by cytnx_int64 or cytnx_uint64 #444

manuschneider · 2024-08-20T17:25:51Z

When referring to positive numbers like index or block numbers, the data structure is either a cytnx_int64 or a cytnx_uint64.
For example, UniTensor.permute(std::vector<cytnx_int64>) and UniTensor.at(std::vector<cytnx_uint64>).
For UniTensor.get_block() both versions exist and the difference (if any) is not obvious to me.

This is a bit confusing and inconstant. While it should not cause problems on the Python side, the mix leads to difficulties with C++ because the two types are not automatically converted, at least when combined in vectors.

Is there a reason for different conventions? Could everything be changed to one convention? How can this be done without breaking compatibility with existing code, tests, etc?

The text was updated successfully, but these errors were encountered:

kaihsin · 2024-08-21T03:56:43Z

There is. Originally some of the API are design for large Tensor. Meaning that if you don't need to support -1 for indexing like python, the C++ side can utilize full 64bit address indexing. the API that using int64 is only to make compatible with Python -1 indexing

manuschneider · 2024-08-21T08:23:28Z

I see. But perumte does not work with negative indices anyways in the current implementation, at least for BlockUniTensor

yingjerkao · 2024-08-29T03:32:28Z

There is. Originally some of the API are design for large Tensor. Meaning that if you don't need to support -1 for indexing like python, the C++ side can utilize full 64bit address indexing. the API that using int64 is only to make compatible with Python -1 indexing

Can we do it at Pybind11 level to make C++ code consistent?

IvanaGyro · 2024-09-01T03:17:10Z

Here is a crop from the Google C++ style guide that we apply. In short, only use unsigned integers when representing a bit pattern and use size_t and ptrdiff_t when appropriate.

Unsigned integers are good for representing bitfields and modular arithmetic. Because of historical accident, the C++ standard also uses unsigned integers to represent the size of containers - many members of the standards body believe this to be a mistake, but it is effectively impossible to fix at this point. The fact that unsigned arithmetic doesn't model the behavior of a simple integer, but is instead defined by the standard to model modular arithmetic (wrapping around on overflow/underflow), means that a significant class of bugs cannot be diagnosed by the compiler. In other cases, the defined behavior impedes optimization.

That said, mixing signedness of integer types is responsible for an equally large class of problems. The best advice we can provide: try to use iterators and containers rather than pointers and sizes, try not to mix signedness, and try to avoid unsigned types (except for representing bitfields or modular arithmetic). Do not use an unsigned type merely to assert that a variable is non-negative.

ianmccul · 2024-09-01T05:37:00Z

I suggest use int (or some other signed type) for everything. But note that size_t is unsigned, so don't use that! use ssize_t.

C++20 has std::ssize() function for containers, so it is safe to write eg
Container<T> v; for (auto i = std::ssize(v)-1; i > 0; --i) { ... }

The MPToolkit is currently C++14 and not ready to use C++20 yet (not enough compiler support), but meanwhile I'm just going to add some ssize() function myself.

In new code, my feeling is always use a signed integer, unless it is actually bit-flipping. Having a container using size_type as a signed type will work fine, and if you mix it with standard containers it will harmlessly convert to an unsigned type. But always use std::ssize() when getting the size of containers when it needs to work with std containers as well, and similarly wrap other existing uses of unsigned types.

See also:
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1428r0.pdf

Bjarne also advocates for making all .size() functions return a signed int: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1491r0.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refer to indices by cytnx_int64 or cytnx_uint64 #444

Refer to indices by cytnx_int64 or cytnx_uint64 #444

manuschneider commented Aug 20, 2024 •

edited

Loading

kaihsin commented Aug 21, 2024

manuschneider commented Aug 21, 2024

yingjerkao commented Aug 29, 2024

IvanaGyro commented Sep 1, 2024 •

edited

Loading

ianmccul commented Sep 1, 2024

Refer to indices by cytnx_int64 or cytnx_uint64 #444

Refer to indices by cytnx_int64 or cytnx_uint64 #444

Comments

manuschneider commented Aug 20, 2024 • edited Loading

kaihsin commented Aug 21, 2024

manuschneider commented Aug 21, 2024

yingjerkao commented Aug 29, 2024

IvanaGyro commented Sep 1, 2024 • edited Loading

ianmccul commented Sep 1, 2024

manuschneider commented Aug 20, 2024 •

edited

Loading

IvanaGyro commented Sep 1, 2024 •

edited

Loading