Releases: jina-ai/clip-as-service
Release v1.8.1
Highlights
- support fp16
- add built-in HTTP server
- add concurrent BertClient
- fix Python 2 compatibility
- support showing tokenization results to the client
Improvements
- fix benchmark
- fix RTD generation
Release v1.7.0
Highlights
- Now supports Windows! Fixed the logger serialization issue
- Add dashboard for monitoring the service in real-time
Improvements
- Fix timeout and zmq.linger problem on the client side
- Add a new server option `-mask_cls_sep` for masking [CLS] and [SEP] before pooling
- Fix communication logic to avoid KeyError on the server
- Fix README and documentation
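To illustrate what `-mask_cls_sep` does, here is a minimal sketch (not library code; the real masking happens inside the server's TensorFlow graph) of excluding the special tokens from mean pooling:

```python
import numpy as np

# Illustrative: average only the real tokens, skipping [CLS]/[SEP].
def masked_mean_pool(token_embeddings, tokens):
    # 1.0 for real tokens, 0.0 for the special markers
    mask = np.array([t not in ("[CLS]", "[SEP]") for t in tokens], dtype=float)
    return (token_embeddings * mask[:, None]).sum(axis=0) / mask.sum()

emb = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]])
toks = ["[CLS]", "hello", "world", "[SEP]"]
pooled = masked_mean_pool(emb, toks)  # averages only "hello" and "world"
```

Without the mask, the [CLS] and [SEP] vectors would be averaged in and shift the sentence embedding.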
Release v1.6.0
Highlights
- Optimize for concurrency across multiple clients. Multiple sockets between the ventilator and the workers prevent a very large job from a single client from queuing up a worker and leaving other clients hanging forever.
- With the new `-device_map` option, you can specify which GPUs to run on, and even mix GPU/CPU workloads
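A hypothetical launch showing how `-device_map` combines with the usual server flags (`-model_dir` and `-num_worker` per the project's README; the model path is a placeholder):

```shell
# Run 4 workers, pinning them to specific GPUs via -device_map
bert-serving-start -model_dir /tmp/uncased_L-12_H-768_A-12 -num_worker 4 -device_map 0 1 2 3
```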
Improvements
- Add `timeout` to BertClient; a TimeoutError is raised when the server is not online
- Refactor `device_map` on the server side, fixing bugs
- Use zmq decorator to improve shutdown behavior
- Add cosine distance explanation to FAQ
- Update README for better readability
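The cosine-distance explanation added to the FAQ can be condensed into a few lines (the function name here is illustrative, not part of the library API): cosine similarity compares directions and ignores vector magnitude.

```python
import numpy as np

# Minimal sketch of cosine similarity between two embedding vectors.
def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

v1 = np.array([1.0, 0.0])
v2 = np.array([0.0, 1.0])
v3 = np.array([2.0, 0.0])
sim_orthogonal = cosine_similarity(v1, v2)  # unrelated directions
sim_parallel = cosine_similarity(v1, v3)    # same direction, different scale
```

Cosine distance is then `1 - cosine_similarity`, so identical directions give a distance of 0.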
Release v1.5.5
Highlights
- The BERT graph is first frozen, optimized, and stored as a self-contained file; workers then load from it (like model export in tf-serving)
- Customized tokenization is now supported
- Refactor the `BertWorker` to make it more multiprocess-friendly
- Fix the Python 2 pip install bug and an encoding bug in BertClient
Improvements
- gather all zmq temp files in one place
- improve type checks on the client side
- add a flag for CPU/GPU support
- add a flag for XLA support (no observable improvement so far)
Release v1.5
bert-as-service
is now available on PyPI 🎆! From now on you can simply update the package by
pip install -U bert-serving-server bert-serving-client
No need to copy-paste the client code again!
Highlights
- fix async scheduling on the server-side #105 #101
- fix masking in REDUCE_MEAN, REDUCE_MAX, and REDUCE_MEAN_MAX #93
- add flag to support switching between GPU/CPU #111 #108
- fix concurrent client building issue #110 #60
- fix server hangs due to slow-joiner #110
Improvements
- add visualization examples
- improve readme by giving more screenshots and examples
- fix path error in examples
- add version check for client and server
- refactor client handshake logic
- update docker commands
Release v1.4
Highlights
- Add masking in `REDUCE_MEAN`, `REDUCE_MAX`, and `REDUCE_MEAN_MAX`. This matters most when `max_seq_len` is much larger than the actual sequence length from clients
- Reduce latency significantly, improving speed by 20%; check out the new benchmark
- Refactor async encoding, client needs to be upgraded to use the new version
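The masking fix above is easy to see in a toy example (illustrative only; the real masking lives in the server's graph): when `max_seq_len` far exceeds the real sequence, an unmasked mean averages in the zero padding and dilutes the embedding.

```python
import numpy as np

max_seq_len = 6
emb = np.zeros((max_seq_len, 2))
emb[:2] = [[2.0, 4.0], [4.0, 8.0]]  # only the first 2 rows are real tokens

unmasked_mean = emb.mean(axis=0)    # padding rows drag the average down
masked_mean = emb[:2].mean(axis=0)  # average over real tokens only
```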
Improvements
- Reformat server logging
- Update benchmark table and plot
- Update readme
Release v1.3
Highlights
- Add classification examples
- Restrict GPU memory usage
Improvements
- More comprehensive README
- Add more examples
- Add benchmark for pooling_layers
Release v1.2
Highlights
- Fix random C-level assert errors caused by multi-threading in `BertServer`
- Redesign the message flow in `BertServer` and remove all back-chatter
- The server now opens two ports: one for pushing text in, the other for publishing encoded vectors
Improvements
- fix/add more examples
- fix figure in README.md
- use JSON for serialization everywhere
Release v1.1
Highlights
- Support output of word embeddings (by setting `pooling_strategy=NONE`)
- Support different pooling strategies
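The shape difference between the two modes can be sketched as follows (illustrative, not library code): `pooling_strategy=NONE` keeps the full token-level matrix, while a reducing strategy collapses it to a single sentence vector.

```python
import numpy as np

# Illustrative pooling: NONE returns per-token vectors, a reducing
# strategy returns one vector for the whole sequence.
def pool(token_embeddings, strategy):
    if strategy == "NONE":
        return token_embeddings           # word embeddings: (seq_len, hidden)
    return token_embeddings.mean(axis=0)  # sentence embedding: (hidden,)

emb = np.ones((25, 768))  # (max_seq_len, hidden_size) per-token vectors
word_vecs = pool(emb, "NONE")
sent_vec = pool(emb, "REDUCE_MEAN")
```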
Improvements
- More comprehensive README
- fix logger
- fix wrong-order problem when `client_batch_size` is large
- fix conflicting IPC addresses when starting multiple server instances
Release v1.0
Highlights
- Refactor the server-side pipeline and job scheduling, improving scalability and significantly reducing latency.
- Optimize the serialization of numpy array between sockets
- Add more exhaustive benchmark results.
Improvements
- Client now shows the server configuration on first connect.
- Better logging per worker per module
- Fix typos and enrich the README content
- Add Dockerfile contributed in #12