Step-by-Step: Training a Vision Model with FastAI in Microsoft Fabric

Embarking on a journey of deep learning with Microsoft Fabric, we’re taking on an intriguing task: differentiating between pangolins and armadillos using a fastai vision model. This blog will be your comprehensive guide to training and deploying a model that can accurately distinguish these animals. Leveraging Microsoft Fabric’s advanced analytics, we’ll utilize fastai to build a model that’s both effective and quick to implement. Whether you're a data enthusiast or a seasoned analyst, join us as we delve into the intricacies of machine learning and outline the steps to accomplish this fascinating challenge.

Inspiration for this post and many of the implementation details come from the fantastic Practical Deep Learning for Coders course. My contributions here demonstrate how to get this working in Fabric with mlflow. To better understand the code in this blog (e.g., what is a datablock?), check out lesson 1 of the course above.

In this blog, you will find:

Step 1 - Create an ML Model

Step 2 - Train Your Model

Step 3 - Save your ML Model

Step 4 - Get the Model RunID

Step 5 - Load and Predict

Next Steps: Explore 'Fabric and Copilot' Recordings

Conclusion

Step 1 - Create an ML Model

Begin by creating a new ML model in the Data Science section of Microsoft Fabric.

[Screenshot: creating a new ML model in the Data Science section of Microsoft Fabric]

You'll be prompted to name your model. Since we’re classifying images of pangolins and armadillos, I’ve named mine pangolinVsArmadillo.

Next, click on “Start with a new Notebook.”

[Screenshot: creating a new ML model version by starting with a new notebook]

Step 2 - Train Your Model

The following commands will download images of pangolins and armadillos, create a datablock, and train your ML model using mlflow and fastai. Add these to your new Notebook, each section as its own cell.

Install and Import Requirements:

!pip install fastbook
from fastbook import *

Download images of pangolins and armadillos using DuckDuckGo:

#search and save images using duckduckgo (ddg)
searches = 'pangolin', 'armadillo'
path = Path('pangolin_or_not')

if not path.exists():
    path.mkdir(exist_ok=True)
    for o in searches:
        dest = (path/o)
        dest.mkdir(exist_ok=True)
        results = search_images_ddg(f'{o} photo')
        download_images(dest, urls=results[:200])
        resize_images(dest, max_size=400, dest=dest)

Some warnings/errors will show up in the output of the above cell, but they can be ignored.

Remove any failed downloads or files that aren’t valid images:

#remove any bad images (images that can't be opened)
failed = verify_images(get_image_files(path))
failed.map(Path.unlink);
failed

Some warnings/errors will show up in the output of the above cell, but they can be ignored.
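After the cleanup, it can be worth checking how many images are left in each folder, since failed downloads may leave the two classes unbalanced. Below is a minimal sketch using only the Python standard library. The helper name count_images_per_class and the list of file extensions are assumptions for illustration; they are not part of fastai.

```python
from collections import Counter
from pathlib import Path

# Assumed set of image extensions; adjust to match the files you downloaded.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".bmp"}

def count_images_per_class(root: Path) -> Counter:
    """Count image files under root, grouped by the name of their parent folder."""
    counts = Counter()
    for file_path in root.rglob("*"):
        if file_path.is_file() and file_path.suffix.lower() in IMAGE_SUFFIXES:
            counts[file_path.parent.name] += 1
    return counts

root = Path("pangolin_or_not")  # folder created by the download cell
if root.exists():
    for label, n in sorted(count_images_per_class(root).items()):
        print(f"{label}: {n}")
```

If one class ends up with far fewer images than the other, re-running the download for that search term is one option.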

Create the fastai datablock, load in the downloaded pictures, and display 9 of them in the cell output.

#create your fastai datablock
dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=[Resize(192, method='squish')]
).dataloaders(path)

dls.show_batch(max_n=9)
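Two arguments in the datablock do most of the work: get_y=parent_label takes each image's label from the name of the folder it sits in, and RandomSplitter(valid_pct=0.2, seed=42) holds back 20% of the images for validation. The sketch below illustrates both ideas in plain Python. It is not fastai's implementation, the function names are made up for this example, and the split it produces will not match fastai's exact split because the random number generators differ.

```python
import random
from pathlib import Path

def label_from_parent(file_path: Path) -> str:
    """Return the name of the folder that directly contains the file."""
    return file_path.parent.name

def random_split(items, valid_pct=0.2, seed=42):
    """Shuffle a copy of items and hold out valid_pct of them for validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    return shuffled[n_valid:], shuffled[:n_valid]

# Hypothetical file names laid out the way the download cell saves them.
files = [Path("pangolin_or_not") / name / f"{i:03d}.jpg"
         for name in ("pangolin", "armadillo") for i in range(5)]

train, valid = random_split(files)
print(len(train), len(valid))        # 8 2
print(label_from_parent(files[0]))   # pangolin
```

Fixing the seed means the same images land in the validation set on every run, which keeps error rates comparable between training runs.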

Train Your Vision Model

In this case, we’re using resnet18 as our base model and training for only one epoch. Increasing the number of epochs and/or changing the base model can improve model accuracy.

#train and track your model using mlflow and fastai. For test/demo purposes, we'll only do 1 epoch of training
import mlflow.fastai
from mlflow import MlflowClient

def print_auto_logged_info(r):
    tags = {k: v for k, v in r.data.tags.items() if not k.startswith("mlflow.")}
    artifacts = [f.path for f in MlflowClient().list_artifacts(r.info.run_id, "model")]
    print(f"run_id: {r.info.run_id}")
    print(f"artifacts: {artifacts}")
    print(f"params: {r.data.params}")
    print(f"metrics: {r.data.metrics}")
    print(f"tags: {tags}")

def main(epochs=1):
    model = vision_learner(dls, resnet18, metrics=error_rate)

    # Enable auto logging
    mlflow.fastai.autolog()

    # Start MLflow session
    with mlflow.start_run() as run:
        #model.fit(epochs, learning_rate)
        model.fine_tune(epochs)

    # fetch the auto logged parameters, metrics, and artifacts
    print_auto_logged_info(mlflow.get_run(run_id=run.info.run_id))

main()

After running all the above cells, you should see something like this as the output of the final cell:

[Screenshot: trainingComplete]

⚠️ The accuracy of this model is only about 86% ((1 - 0.138888) * 100 ≈ 86.1). Adding more epochs or changing the base model will help improve this.
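The accuracy figure is derived from the error_rate metric that the training cell reports: accuracy is one minus the error rate, expressed as a percentage. A small sketch of that arithmetic follows; the helper name accuracy_percent is made up for this example.

```python
def accuracy_percent(error_rate: float) -> float:
    """Convert a validation error rate (0.0 to 1.0) into accuracy as a percentage."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    return (1.0 - error_rate) * 100.0

# 0.138888 is the error rate quoted above for one epoch of training.
print(f"{accuracy_percent(0.138888):.1f}%")  # 86.1%
```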

You should also see a new Experiment in your workspace (you may need to refresh your browser window):

[Screenshot: the new experiment in the workspace]

Step 3 - Save your ML Model

Open the new experiment that’s been created in your workspace. Click on “Save run as ML model”.

[Screenshot: the “Save run as ML model” option]

Click on “Select an existing ML model”, select the model you created and click Save.

[Screenshot: selecting an existing ML model]

Step 4 - Get the Model RunID

Open up the ML Model you created, expand Version 1, expand model, click on MLmodel and then copy the run id:

[Screenshot: the run id in the MLmodel file]
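The MLmodel file is a small YAML document, and the run id is stored in a top-level run_id field. If you would rather pull it out programmatically than copy it by hand, a minimal sketch is shown below. It assumes you already have the file's text in a string; the function name extract_run_id and the sample contents (including the run id value) are hypothetical.

```python
def extract_run_id(mlmodel_text: str) -> str:
    """Return the value of the top-level run_id field from MLmodel file text."""
    for line in mlmodel_text.splitlines():
        # Top-level keys start at column 0; nested keys are indented.
        if line.startswith("run_id:"):
            return line.split(":", 1)[1].strip().strip("'\"")
    raise ValueError("no top-level run_id field found")

# Hypothetical MLmodel contents; the run_id value below is made up.
sample = """artifact_path: model
flavors:
  python_function:
    loader_module: mlflow.fastai
run_id: 0123456789abcdef0123456789abcdef
"""
print(extract_run_id(sample))  # 0123456789abcdef0123456789abcdef
```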

Step 5 - Load and Predict

Create a new notebook in your workspace. Add the following code to your notebook.

Install and import required modules and download a single image of a creature - could be a pangolin or an armadillo - it’s up to you.

!pip install fastbook
import mlflow
import mlflow.fastai
from fastbook import *

#search and save an image of a pangolin - can be changed to armadillo in the line below if you want
urls = search_images_ddg('pangolin photos', max_images=1)
print(len(urls), urls[0])
#save image as creature.jpg
dest = Path('creature.jpg')
if not dest.exists(): download_url(urls[0], dest, show_progress=False)
im = Image.open(dest)
im.to_thumb(256, 256)

Load the ML model we trained and saved.

#load the model via the runID copied in Step 4
run_id = "[[enter your runID here]]"
model = mlflow.fastai.load_model(f"runs:/{run_id}/model")

❗To be honest, I’m not sure if the above is the correct way to load a saved model in Fabric. The “Apply this version” code that Fabric can auto-create didn’t work for me, and the above does allow me to make predictions from a separate Fabric notebook, so I’m going with this for now. If you know the correct way to load a saved model, please do let me know.
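The string passed to load_model follows the pattern runs:/<run id>/<artifact path>, where the artifact path is model in this walkthrough. Since it is easy to run the cell with the placeholder still in place, here is a small sketch of a helper that builds the URI and fails early when the run id looks unreplaced. The function build_model_uri is a hypothetical convenience, not part of mlflow, and the run id in the example is made up.

```python
def build_model_uri(run_id: str, artifact_path: str = "model") -> str:
    """Build a runs:/ URI, refusing an empty or unreplaced placeholder run ID."""
    run_id = run_id.strip()
    if not run_id or "[" in run_id or " " in run_id:
        raise ValueError("replace the placeholder with the run ID copied in Step 4")
    return f"runs:/{run_id}/{artifact_path}"

# Made-up run ID for illustration only.
print(build_model_uri("0123456789abcdef0123456789abcdef"))
# runs:/0123456789abcdef0123456789abcdef/model
```

The returned string can then be passed to mlflow.fastai.load_model in place of the hand-written one.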

Run the predict function with the image we just downloaded to see if it’s a pangolin or an armadillo. Print the full result variable as well to see how confident the model is in its prediction.

#use the loaded model to see if the image was a pangolin or armadillo
result = model.predict(PILImage.create('creature.jpg'))
print(f"It's a {result[0]}.")
print(result)

The result from the notebook should look like this:

[Screenshot: the model's prediction output]
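For a single-label image classifier, fastai's predict generally returns a tuple of three items: the predicted label, the index of that label, and the probability for each class. The sketch below turns such a tuple into a readable confidence summary. The helper describe_prediction is made up for this example, and the values passed to it are hypothetical stand-ins so the snippet runs on its own. In a real notebook, the list of class names would typically come from the loaded model's vocabulary (for example model.dls.vocab), which is an assumption about your setup.

```python
def describe_prediction(label, index, probabilities, vocab):
    """Format a (label, index, probabilities) prediction as a readable summary."""
    i = int(index)
    per_class = ", ".join(
        f"{name}: {float(p):.1%}" for name, p in zip(vocab, probabilities)
    )
    confidence = float(probabilities[i])
    return f"It's a {label} ({confidence:.1%} confident). All classes: {per_class}"

# Hypothetical values standing in for the output of model.predict(...).
print(describe_prediction("pangolin", 1, [0.04, 0.96], ["armadillo", "pangolin"]))
# It's a pangolin (96.0% confident). All classes: armadillo: 4.0%, pangolin: 96.0%
```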

Next Steps: Explore 'Fabric and Copilot' Recordings

Dive deeper into Microsoft Fabric and uncover the full spectrum of its capabilities. Explore how this powerful platform enables organizations to efficiently manage, analyze, and harness valuable data and insights.

By registering for the Copilot Virtual Briefing Sessions, you'll gain exclusive access to a wealth of information, including the recording of "Fabric and Copilot." Join Scott Sugar as he delves into the intricacies of Fabric, showcasing how Copilot can revolutionize your data analysis workflows and enhance your decision-making processes. Register now and unlock the true potential of Microsoft Fabric and elevate your data transformation journey to new heights.

Moreover, don't miss the exclusive Copilot Virtual Briefing Session every Thursday in July and August. These sessions are truly special, allowing you to delve into Copilot and Copilot for Microsoft 365 and experience persona-based demonstrations. Be sure to check your schedule, register for the sessions that align with your availability and interests, and gain access to the insightful recordings on Copilot for Security, AI Strategy for Businesses, and Copilot for Sales.

Conclusion

As we conclude this technical walkthrough, you’re now equipped with the knowledge to train and deploy a fastai vision model in Microsoft Fabric that can distinguish between pangolins and armadillos—or other creatures if you change the search terms.

This exploration of deep learning within Microsoft Fabric demonstrates that with the right tools and guidance, creating precise and efficient models is attainable. We hope this blog has illuminated the path for your own projects and inspired you to leverage fastai and Microsoft Fabric for your deep learning endeavors.

Unleash the Full Potential of Your Data Transformation

Ready to take your data transformation to the next level? ProServeIT can help! As a Microsoft Solutions Partner in Data & AI, our team of certified professionals is here to empower your organization to leverage the full potential of your data.

Contact ProServeIT today and schedule a consultation with our data specialists.

By Scott Sugar
July 26, 2024

From the 1980s, when his father used to hand down old computer equipment, to now, Scott Sugar has always had a fascination with technology. The ability to communicate with people, regardless of distance or location, is, in Scott's opinion, one of the best things about tech. With over 20 years of experience in the IT industry, and 17 years at ProServeIT, Scott's areas of expertise include data & analytics, and IT operations, monitoring, and alerting. Scott heads up ProServeIT's Ho Chi Minh City, Vietnam office. He has spent the majority of his adult life in Asia, and speaks 3 languages.
