Howdy,
I have been running into a bunch of issues trying to get this model to work. I don't really know what I am doing wrong, but I think it may have something to do with my AI connection file. I first tried OpenAI, which generates output, but it doesn't look like it is actually using my API key since it isn't consuming credits. After fiddling with it, I switched to Llama, which just crashes ArcGIS Pro.
Here's how my OpenAI .ais file looks:
{
"service_provider" : "OpenAI",
"api_key" : "API_KEY"
}
Here's how the llama .ais file looks:
You have this question posted in the ArcGIS Living Atlas of the World place in Community.
Is this your intended location, or are you looking to move it elsewhere, such as
ArcGIS Image Analyst - Esri Community?
@DanPatterson I was a little bit iffy on where exactly to post it. Maybe I'll cross post it.
Hi @cepsgis,
To use OpenAI models, your .ais file should be structured as follows. Set "deployment_name" to the model you want to use (e.g., gpt-4o, gpt-4):
{
"service_provider": "OpenAI",
"api_key": "your_api_key",
"deployment_name": "gpt-4o"
}
For using a local LLaMA model, your .ais file should look like this:
{
"service_provider": "local-llama"
}
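If it helps, here is a rough Python sketch for sanity-checking a connection file before pointing the tool at it. The key names come from the two examples in this thread. The function name check_connection_file and the idea that these are the only keys that matter are my own assumptions, not something confirmed by Esri. It uses the standard json module, which rejects inline // comments, so a file containing one is reported as invalid JSON here. Whether the tool itself tolerates such comments isn't established in this thread.

```python
import json

# Keys taken from the examples in this thread. Treating them as the
# required set for each provider is an assumption, not documented behavior.
REQUIRED_KEYS = {
    "OpenAI": {"service_provider", "api_key", "deployment_name"},
    "local-llama": {"service_provider"},
}


def check_connection_file(text):
    """Return a list of problems found in the text of a connection file.

    Hypothetical helper for illustration; an empty list means no problem
    was detected by these checks.
    """
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(config, dict):
        return ["top level must be a JSON object"]
    provider = config.get("service_provider")
    if provider not in REQUIRED_KEYS:
        return [f"unrecognized service_provider: {provider!r}"]
    missing = REQUIRED_KEYS[provider] - set(config)
    return [f"missing key: {key}" for key in sorted(missing)]
```

To use it, read the file's contents (for example with open(path, encoding="utf-8").read()) and pass the text to check_connection_file. For the OpenAI file in the original post, which has no "deployment_name", this check would report that key as missing.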
To use a LLaMA model locally, follow these steps:
After completing these steps, also make sure that the Llama weights are present on your machine at C:\Users\<username>\.cache\huggingface\hub\models--meta-llama--Llama-3.2-11B-Vision-Instruct.
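A quick way to confirm the weights folder is there is a small Python check like the one below. This is only a sketch: it assumes the cache sits under your user profile at the path given above, and the helper names llama_weights_dir and weights_present are made up for illustration. It only checks that the folder exists and is not empty, not that the download is complete.

```python
from pathlib import Path

# Folder name copied from the path mentioned in this thread.
MODEL_DIR_NAME = "models--meta-llama--Llama-3.2-11B-Vision-Instruct"


def llama_weights_dir(home=None):
    """Build the expected weights folder under the given (or current) user profile."""
    home = Path(home) if home is not None else Path.home()
    return home / ".cache" / "huggingface" / "hub" / MODEL_DIR_NAME


def weights_present(home=None):
    """Return True if the expected folder exists and contains at least one entry."""
    folder = llama_weights_dir(home)
    return folder.is_dir() and any(folder.iterdir())
```

Calling weights_present() with no argument checks the profile of the user running Python, and llama_weights_dir() shows which folder was checked.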
This should help you set up the models correctly. Let me know if you run into any issues!
Okay, after extensive debugging, I've identified the issue. Running this model with the local LLaMA option isn't possible because the Torch and Transformers versions in the deep learning libraries currently available through Esri do not meet the model's minimum requirements. Any suggestions?
Hi @cepsgis,
The required versions of the libraries are packaged within the DLPK itself. If you're encountering any specific errors while trying to run the model, could you please share the details? I'd be happy to help troubleshoot further.
Classify Objects Using Deep Learning
=====================
Tool Path
Input Raster 615950.sid
Output Classified Objects Feature Class C:\batch\vision-language\Default.gdb\c615950_ClassifyObjectsUsing
Model Definition C:\batch\vision-language\VisionLanguageClassification.dlpk
Input Features
Class Label Field ClassLabel
Processing Mode PROCESS_AS_MOSAICKED_IMAGE
Arguments classes 'Grass, Rock';additional_context 'You are looking at arieal imagery, you need to find all the rock and grass for this imagery';strict_classification false;ai_connection_file C:\batch\vision-language\scripts\ai_connection_file.json
Caption Caption
=====================
Messages
Start Time: Friday, April 25, 2025 8:07:14 AM
ERROR 999999: Something unexpected caused the tool to fail. Contact Esri Technical Support (http://esriurl.com/support) to Report a Bug, and refer to the error help for potential solutions or workarounds.
Unable to obtain configuration properties associated with the raster function.
Traceback (most recent call last):
File "C:\Users\BCSERE~1\AppData\Local\Temp\ArcGISProTemp15504\VisionLanguageClassification.dlpk\VisionLanguageClassification.py", line 352, in getConfiguration
import torch
ModuleNotFoundError: No module named 'torch'
Configuration properties returned by the python raster function is not a python dictionary.
Failed to execute (ClassifyObjectsUsingDeepLearning).
Failed at Friday, April 25, 2025 8:07:15 AM (Elapsed Time: 0.50 seconds)
Hi @cepsgis,
Before using the Llama vision model, ensure that the supported deep learning libraries are installed. For more details, check the Deep Learning Libraries Installer for ArcGIS. Torch is part of the deep learning libraries installer.
The Deep Learning Libraries ship with PyTorch 2.0.1.
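Since the traceback above is a plain ModuleNotFoundError for torch, one way to check whether the libraries are visible to the Python environment is a snippet like this, run from the same environment ArcGIS Pro uses (for example its Python window). It is a minimal sketch using only the standard library. The helper name missing_modules is hypothetical, and checking transformers alongside torch is an assumption based on the libraries mentioned earlier in this thread.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Show which interpreter is being checked, then which modules are absent.
print("Python executable:", sys.executable)
print("Missing:", missing_modules(["torch", "transformers"]) or "none")
```

If torch shows up as missing, that matches the error in the tool messages and points to the deep learning libraries not being installed in that environment.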