Michele Dolfi
|
d4c87133f3
|
feat: Introduce pluggable VLM runtime system with preset-based configuration (#2919)
* model runtime refactoring
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add test
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* fix code formula preset
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* batch prediction
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* use presets and new vlm options in CLI
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* use new model settings by default
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* running
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* update examples
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* fixes for running examples
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* keep old stage
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* update model
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* use granite 3.3 and set options
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* revisit init logic and propagate the proper options to the runtimes
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* update all stages with original setup
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* per stage registry
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* use chat template
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* remove duplicated predict() and factor out some utils
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* working picture description examples
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add granite docling as code formula model
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* rename code formula presets
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* fix running minimal_vlm example
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add all models to presets and run compare_vlm
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* remove unused repo_id
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* update vlm api model example
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* fix legacy examples
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add another legacy example
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* fix test
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* avoid automatic fallback to mlx and fix end_of_utterance in codeformula
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* move vlm_convert_model
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* use new vlm runtime class
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* flags for CI
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* rename runtimes to explicit vlm_runtimes
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* renaming from runtime to inference engine and model families
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* fixes
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* fix test
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add docs with stages
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* update docs catalog page
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* rename runtime to inference engine
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2026-02-04 17:29:17 +01:00 |
|