<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[AimStack - Medium]]></title>
        <description><![CDATA[Aim logs your training runs, enables a beautiful UI to compare them and an API to query them programmatically. - Medium]]></description>
        <link>https://medium.com/aimstack?source=rss----d26c2a464e51---4</link>
        <image>
            <url>https://cdn-images-1.medium.com/proxy/1*TGH72Nnw24QL3iV9IOm4VA.png</url>
            <title>AimStack - Medium</title>
            <link>https://medium.com/aimstack?source=rss----d26c2a464e51---4</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Mon, 13 Apr 2026 08:35:24 GMT</lastBuildDate>
        <atom:link href="https://medium.com/feed/aimstack" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Launching Aim on Hugging Face Spaces]]></title>
            <link>https://medium.com/aimstack/launching-aim-on-hugging-face-spaces-5ee2f87d0b7e?source=rss----d26c2a464e51---4</link>
            <guid isPermaLink="false">https://medium.com/p/5ee2f87d0b7e</guid>
            <category><![CDATA[hugging-face]]></category>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[integration]]></category>
            <dc:creator><![CDATA[Gor Arakelyan]]></dc:creator>
            <pubDate>Tue, 25 Apr 2023 17:29:56 GMT</pubDate>
            <atom:updated>2023-04-25T17:29:56.189Z</atom:updated>
<content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*9Ye12MCFGLO9Snv4CbIttA.png" /></figure><p>We are excited to announce the launch of Aim on Hugging Face Spaces! 🚀</p><p>With just a few clicks, you can now deploy Aim on the Hugging Face Hub and seamlessly share your training results with anyone.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*z_Nws-SaMejgiJjkp4K_6A.png" /></figure><p>Aim is an open-source, self-hosted AI Metadata tracking tool. It provides a performant and powerful UI for exploring and comparing metadata, such as training runs or AI agent executions. Additionally, its SDK enables programmatic access to tracked metadata — perfect for automations and Jupyter Notebook analysis.</p><p>In this article, you will learn how to deploy and share your own Aim Space. We will also take a quick tour of Aim and see how it helps you explore and compare your training logs with ease. Let’s dive in and get started!</p><p>Learn more about Aim on the GitHub repository: <a href="https://github.com/aimhubio/aim">github.com/aimhubio/aim</a></p><h3>Deploy Aim on Hugging Face Spaces within seconds using the Docker template</h3><p>To get started, simply navigate to the Spaces page on the Hugging Face Hub and click on the “Create new Space” button, or open the page directly via the following link: <a href="https://huggingface.co/new-space?template=aimstack/aim">https://huggingface.co/new-space?template=aimstack/aim</a></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*jnu8QJ3U0Zp6EvM0mBwqSA.png" /></figure><p>Set up your Aim Space in no time:</p><ol><li>Choose a name for your Space.</li><li>Adjust the Space hardware and the visibility mode.</li><li>Submit your Space!</li></ol><p>After submitting the Space, you’ll be able to monitor its progress through the building status:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*U3PG3l7jYjoxoxz9PDMhLw.png" 
/></figure><p>Once it transitions to “Running”, your Space is ready to go!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*i5fjIPuwzMJyRRLdl9zUBw.png" /></figure><p>Ta-da! 🎉 You’re all set to start using Aim on Hugging Face.</p><p>By pushing your logs to your Space, you can easily explore, compare, and share them with anyone who has access. Here’s how to do it in just two simple steps:</p><ol><li>Run the following bash command to compress the .aim directory:</li></ol><pre>tar -czvf aim_repo.tar.gz .aim</pre><p>2. Commit and push the files to your Space.</p><p>That’s it! Now open the App section of your Space, and Aim will display your training logs.</p><p>Updating Spaces is incredibly convenient — you just need to commit the changes to the repository, and it will automatically re-deploy the application for you. 🔥</p><h3>See Aim in Action with Existing Demos on the Hub</h3><p>Let’s explore live Aim demos already available on the Hub. Each demo highlights a distinct use case and demonstrates the power of Aim in action.</p><ul><li>Neural machine translation task: <a href="https://huggingface.co/spaces/aimstack/nmt">https://huggingface.co/spaces/aimstack/nmt</a></li><li>Simple handwritten digits recognition task: <a href="https://huggingface.co/spaces/aimstack/digit-recognition">https://huggingface.co/spaces/aimstack/digit-recognition</a></li><li>Image generation task with lightweight GAN implementation: <a href="https://huggingface.co/spaces/aimstack/image-generation">https://huggingface.co/spaces/aimstack/image-generation</a></li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*oOCHgOBzSfty_fQZ0QYeRA.png" /></figure><p>When navigating to your Aim Space, you’ll see the Aim homepage, which provides a quick glance at your training statistics and an overview of your logs.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*NSpYy8lAsp8ODVSWZBNGrA.png" /></figure><p>Open the individual run page to find all the 
insights related to that run, including tracked hyper-parameters, metric results, system information (CLI args, env vars, Git info, etc.) and visualizations.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*nn624ts45eF1zF--LpEz-w.png" /></figure><p>Take your training results analysis to the next level with Aim’s Explorers — tools that let you deeply compare tracked metadata across runs.</p><p>Metrics Explorer, for instance, enables you to query tracked metrics and perform advanced manipulations such as grouping, aggregation, smoothing, adjusting axis scales, and other complex interactions.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*z_Nws-SaMejgiJjkp4K_6A.png" /></figure><p>Explorers provide fully Python-compatible expressions for search, allowing you to query metadata with ease.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*VlzNBmpxqaGhthMjXJZUUQ.png" /></figure><p>In addition to Metrics Explorer, Aim offers a suite of Explorers designed to help you explore and compare a variety of media types, including images, text, audio, and Plotly figures.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*vWfdkFSKK5ZH-0F3ITygHg.png" /></figure><p>Use Aim Space on Hugging Face to effortlessly upload and share your training results with the community in a few steps. Aim empowers you to explore your logs with interactive visualizations at your fingertips, easily compare training runs at scale, and stay on top of your ML development insights!</p><h3>One more thing… 👀</h3><p>With your Aim logs hosted on the Hugging Face Hub, you can embed them in notebooks and websites.</p><p>To embed your Space, construct the following link based on the Space owner and Space name: https://owner-space-name.hf.space. 
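For typical owner and Space names, constructing this link is just string formatting. A minimal sketch (the helper name here is ours for illustration, not part of any Hugging Face API):

```python
# Hypothetical helper: build the embeddable URL from the Space owner
# and Space name, following the pattern shown above.
def space_embed_url(owner: str, space_name: str) -> str:
    return f"https://{owner}-{space_name}.hf.space"

print(space_embed_url("owner", "space-name"))  # https://owner-space-name.hf.space
```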
This link can be used to embed your Space in any website or notebook using the following HTML code:</p><pre>%%html<br>&lt;iframe<br>    src=&quot;https://owner-space-name.hf.space&quot;<br>    frameborder=&quot;0&quot;<br>    width=&quot;100%&quot;<br>    height=&quot;800&quot;<br>&gt;<br>&lt;/iframe&gt;</pre><h3>Next steps</h3><p>We are going to continuously iterate on Aim Space onboarding and usability, including:</p><ul><li>the ability to read logs directly from Hugging Face Hub model repos,</li><li>automatic conversion of TensorBoard logs to Aim format,</li><li>Aim HF Space-specific onboarding steps.</li></ul><p>Much more coming soon… stay tuned for updates!</p><h3>Learn more</h3><p>Check out the Aim Space documentation <a href="https://aimstack.readthedocs.io/en/latest/using/huggingface_spaces.html">here</a></p><p>Aim repo on GitHub: <a href="http://github.com/aimhubio/aim">github.com/aimhubio/aim</a></p><p>If you have questions, join the <a href="https://community.aimstack.io/">Aim community</a>, share your feedback, open issues for new features and bugs. You’re most welcome! 🙌</p><p>Drop a ⭐️ on <a href="https://github.com/aimhubio/aim">GitHub</a>, if you find Aim useful.</p><p>This article was originally published on <a href="https://aimstack.io/blog">Aim blog</a>. Find more in-depth guides and details of the newest releases there.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=5ee2f87d0b7e" width="1" height="1" alt=""><hr><p><a href="https://medium.com/aimstack/launching-aim-on-hugging-face-spaces-5ee2f87d0b7e">Launching Aim on Hugging Face Spaces</a> was originally published in <a href="https://medium.com/aimstack">AimStack</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[LangChain + Aim: Building and Debugging AI Systems Made EASY!]]></title>
            <link>https://medium.com/aimstack/langchain-aim-building-and-debugging-ai-systems-made-easy-de6d183a3198?source=rss----d26c2a464e51---4</link>
            <guid isPermaLink="false">https://medium.com/p/de6d183a3198</guid>
            <category><![CDATA[openai]]></category>
            <category><![CDATA[integration]]></category>
            <category><![CDATA[prompt-engineering]]></category>
            <category><![CDATA[chatgpt]]></category>
            <dc:creator><![CDATA[Gor Arakelyan]]></dc:creator>
            <pubDate>Thu, 06 Apr 2023 18:50:40 GMT</pubDate>
            <atom:updated>2023-06-28T06:29:06.130Z</atom:updated>
<content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*KGZMLgIJyD98QYooWRF9xQ.jpeg" /><figcaption>Generated by Midjourney. Prompted by ChatGPT.</figcaption></figure><h3>The Rise of Complex AI Systems</h3><p>With the introduction of ChatGPT and large language models (LLMs) such as GPT-3.5-turbo and GPT-4, AI progress has skyrocketed. These models have enabled tons of AI-based applications, bringing the power of LLMs to real-world use cases.</p><p>But the true power of AI comes when we combine LLMs with other tools, scripts, and sources of computation to create much more powerful AI systems than standalone models.</p><p><strong>As AI systems get increasingly complex, the ability to effectively debug and monitor them becomes crucial.</strong> Without comprehensive tracing and debugging, the improvement, monitoring and understanding of these systems become extremely challenging.</p><p>In this article, we will take a look at how to use Aim to easily trace complex AI systems built with LangChain. Specifically, we will go over how to:</p><ul><li>track all inputs and outputs of chains,</li><li>visualize and explore individual chains,</li><li>compare several chains side-by-side.</li></ul><h3>LangChain: Building AI Systems with LLMs</h3><p>LangChain is a library designed to enable the development of powerful applications by integrating LLMs with other computational resources or knowledge sources. It streamlines the process of creating applications such as question answering systems, chatbots, and intelligent agents.</p><p>It provides a unified interface for managing and optimizing prompts, creating sequences of calls to LLMs or other utilities (chains), interacting with external data sources, making decisions, and taking actions. 
LangChain empowers developers to build sophisticated, cutting-edge applications by making the most of LLMs and easily connecting them with other tools!</p><h3>Aim: Upgraded Debugging Experience for AI Systems</h3><p>Monitoring and debugging AI systems requires more than just scanning output logs on a terminal.</p><p><strong>Introducing Aim!</strong></p><p>Aim is an open-source AI metadata library that tracks all aspects of your AI system’s execution, facilitating in-depth exploration, monitoring, and reproducibility.</p><p>Importantly, Aim helps to query all the tracked metadata programmatically and is equipped with a powerful UI / observability layer for the AI metadata.</p><p>In this way, Aim makes debugging, monitoring, and comparing different executions a breeze.</p><p><strong>Experience the ultimate control with Aim!</strong></p><p>Check out Aim on GitHub: <a href="http://github.com/aimhubio/aim"><strong>github.com/aimhubio/aim</strong></a></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*3EJstDPipkwPQvqtnPxxqg.jpeg" /></figure><h3>Aim + LangChain = 🚀</h3><p>With the release of LangChain <a href="https://github.com/hwchase17/langchain/releases/tag/v0.0.127"><strong>v0.0.127</strong></a>, it’s now possible to trace LangChain agents and chains with Aim using just a few lines of code! <strong>All you need to do is configure the Aim callback and run your executions as usual.</strong></p><p>Aim does the rest for you by tracking tool and LLM inputs and outputs, agent actions, and chain results. 
It also tracks the CLI command and arguments, system info and resource usage, environment variables, Git info, and terminal outputs.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*JrNhHaP-w-HyaJh80A7Guw.jpeg" /></figure><p>Let’s move forward and build an agent with LangChain, configure Aim to trace executions, and take a quick journey around the UI to see how Aim can help with debugging and monitoring.</p><h3>Hands-On Example: Building a Multi-Task AI Agent</h3><h4>Setting up the agent and the Aim callback</h4><p>Let’s build an agent equipped with two tools:</p><ul><li>the SerpApi tool to access Google search results,</li><li>the LLM-math tool to perform required mathematical operations.</li></ul><p>In this particular example, we’ll prompt the agent to discover who Leonardo DiCaprio’s girlfriend is and calculate her current age raised to the 0.43 power:</p><pre># `llm` is the LLM instance and `manager` is a callback manager that<br># includes the Aim callback (both are created in the full script linked below)<br>tools = load_tools([&quot;serpapi&quot;, &quot;llm-math&quot;], llm=llm, callback_manager=manager)<br>agent = initialize_agent(<br>    tools,<br>    llm,<br>    agent=&quot;zero-shot-react-description&quot;,<br>    callback_manager=manager,<br>    verbose=True,<br>)<br>agent.run(<br>    &quot;Who is Leo DiCaprio&#39;s girlfriend? What is her current age raised to the 0.43 power?&quot;<br>)</pre><p>Now that the chain is set up, let’s integrate the Aim callback. 
It takes just a few lines of code and Aim will capture all the moving pieces during the execution.</p><pre>from langchain.callbacks import AimCallbackHandler<br><br>aim_callback = AimCallbackHandler(<br>    repo=&quot;.&quot;,<br>    experiment_name=&quot;scenario 1: OpenAI LLM&quot;,<br>)<br>aim_callback.flush_tracker(langchain_asset=agent, reset=False, finish=True)</pre><blockquote><strong><em>Aim is entirely open-source and self-hosted, which means your data remains private and isn’t shared with third parties.</em></strong></blockquote><p>Find the full script and more examples in the official LangChain docs: <a href="https://python.langchain.com/docs/ecosystem/integrations/aim_tracking">https://python.langchain.com/docs/ecosystem/integrations/aim_tracking</a></p><h4>Executing the agent and running Aim</h4><p>Before executing the agent, ensure that Aim is installed by executing the following command:</p><pre>pip install aim</pre><p>Now, let’s run multiple executions and launch the Aim UI to visualize and explore the results:</p><ol><li>execute the script by running python example.py,</li><li>then start the UI with the aim up command.</li></ol><p>With Aim up and running, you can effortlessly dive into the details of each execution, compare results, and gain insights that will help you debug and iterate on your chains.</p><h4>Exploring executions via Aim</h4><p><strong>Home page</strong></p><p>On the home page, you’ll find an organized view of all your tracked executions, making it easy to keep track of your progress and recent runs. 
To navigate to a specific execution, simply click on the link, and you’ll be taken to a dedicated page with comprehensive information about that particular execution.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*-MMZ7h2f22moAkI3mP2rDg.jpeg" /></figure><p><strong>Deep dive into a single execution</strong></p><p>When navigating to an individual execution page, you’ll find an overview of system information and execution details. Here you can access:</p><ul><li>CLI command and arguments,</li><li>Environment variables,</li><li>Packages,</li><li>Git information,</li><li>System resource usage,</li><li>and other relevant information about an individual execution.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*_VZBpIrHzxXJh4oaMdTEwQ.jpeg" /></figure><p>Aim automatically captures terminal outputs during execution. Access these logs in the “Logs” tab to easily keep track of the progress of your AI system and identify issues.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*zD08oly3keyx93owbEYf6g.jpeg" /></figure><p>In the “Text” tab, you can explore the inner workings of a chain, including agent actions and tool and LLM inputs and outputs. This in-depth view allows you to review the metadata collected at every step of execution.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*CYLkqn4TAJ-p89MiE1I83Q.jpeg" /></figure><p>With Aim’s Text Explorer, you can effortlessly compare multiple executions, examining their actions, inputs, and outputs side by side. It helps identify patterns or spot discrepancies.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*3EJstDPipkwPQvqtnPxxqg.jpeg" /></figure><p>For instance, in the given example, two executions produced the response, “Camila Morrone is Leo DiCaprio’s girlfriend, and her current age raised to the 0.43 power is 3.8507291225496925.” However, another execution returned the answer “3.991298452658078”. 
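A quick pure-Python check of the arithmetic behind the two answers (using ages 23 and 25, the two values the executions found for the same person):

```python
# The agent's final step raises the age it found to the 0.43 power,
# so a different age yields a different final answer.
for age in (23, 25):
    print(age, "->", age ** 0.43)
# These match the two answers quoted above:
# 3.8507291225496925 (age 23) and 3.991298452658078 (age 25)
```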
This discrepancy occurred because the first two executions incorrectly identified Camila Morrone’s age as 23 instead of 25.</p><p><strong>With Text Explorer, you can easily compare and analyze the outcomes of various executions and make decisions to adjust agents and prompts further.</strong></p><h3>Wrapping Up</h3><p>In conclusion, as AI systems become more complex and powerful, the need for comprehensive tracing and debugging tools becomes increasingly essential. LangChain, when combined with Aim, provides a powerful solution for building and monitoring sophisticated AI applications. By following the practical examples in this blog post, you can effectively monitor and debug your LangChain-based systems!</p><h3>Learn more</h3><p>Check out the Aim + LangChain integration docs <a href="https://python.langchain.com/en/latest/ecosystem/aim_tracking.html">here</a>.</p><p>LangChain repo: <a href="https://github.com/hwchase17/langchain">https://github.com/hwchase17/langchain</a></p><p>Aim repo: <a href="https://github.com/aimhubio/aim">https://github.com/aimhubio/aim</a></p><p>If you have questions, join the <a href="https://community.aimstack.io/">Aim community</a>, share your feedback, open issues for new features and bugs. You’re most welcome! 🙌</p><p>Drop a ⭐️ on <a href="https://github.com/aimhubio/aim">GitHub</a>, if you find Aim useful.</p><p>This article was originally published on <a href="https://aimstack.io/blog">Aim blog</a>. 
Find more in-depth guides and details of the newest releases there.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=de6d183a3198" width="1" height="1" alt=""><hr><p><a href="https://medium.com/aimstack/langchain-aim-building-and-debugging-ai-systems-made-easy-de6d183a3198">LangChain + Aim: Building and Debugging AI Systems Made EASY!</a> was originally published in <a href="https://medium.com/aimstack">AimStack</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Reproducible forecasting with Prophet and Aim]]></title>
            <link>https://medium.com/aimstack/reproducible-forecasting-with-prophet-and-aim-f2e2a80a356f?source=rss----d26c2a464e51---4</link>
            <guid isPermaLink="false">https://medium.com/p/f2e2a80a356f</guid>
            <category><![CDATA[integration]]></category>
            <category><![CDATA[prophet]]></category>
            <category><![CDATA[open-source]]></category>
            <category><![CDATA[mlops]]></category>
            <dc:creator><![CDATA[Davit Grigoryan]]></dc:creator>
            <pubDate>Tue, 14 Mar 2023 18:55:52 GMT</pubDate>
            <atom:updated>2023-03-15T09:44:45.816Z</atom:updated>
<content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ZaIDQvLbqy0KyVG22D0epw.png" /></figure><p>You can now track your Prophet experiments with Aim! The recent Aim v3.16 release includes a built-in logger object for Prophet runs. It tracks Prophet hyperparameters, arbitrary user-defined metrics, extra feature variables, and system metrics. These features, along with the intuitive Aim UI and its pythonic search functionality, can significantly improve your Prophet workflow and accelerate the model training and evaluation process.</p><h3>What is Aim?</h3><p><a href="https://aimstack.io/">Aim</a> is the fastest open-source tool for AI experiment comparison. With more resources and complex models, more experiments are run than ever. Aim is used to deeply inspect thousands of hyperparameter-sensitive training runs.</p><h3>What is Prophet?</h3><p><a href="https://research.facebook.com/blog/2017/2/prophet-forecasting-at-scale/">Prophet</a> is a time series forecasting algorithm developed by Meta. Its official implementation is open-sourced, with libraries for Python and R.</p><p>A key benefit of Prophet is that it does not assume stationarity and can automatically find trend changepoints and seasonality. This makes it easy to apply to arbitrary forecasting problems without many assumptions about the data.</p><h3>Tracking Prophet experiments with Aim</h3><p>Prophet isn’t trained like neural networks, so you can’t track per-epoch metrics or anything of the sort. However, Aim allows the user to track Prophet hyperparameters and arbitrary user-defined metrics to compare performance across models, as well as system-level metrics like CPU usage. In this blog post, we’re going to see how to integrate Aim with your Prophet experiments with minimal effort. 
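As a point of reference for the snippets below: Prophet expects its input as a two-column DataFrame with a ds datestamp column and a numeric y target column. A minimal synthetic-data sketch (sizes and dates here are arbitrary, chosen only for illustration):

```python
import numpy as np
import pandas as pd

# Prophet requires exactly these column names: 'ds' (dates) and 'y' (values)
num_days = 5000
rng = pd.date_range("2015-01-01", periods=num_days, freq="D")
data = pd.DataFrame({"ds": rng, "y": np.random.rand(num_days)})

print(data.shape)  # (5000, 2)
```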
For many more end-to-end examples of usage with other frameworks, check out <a href="https://github.com/aimhubio/aim/tree/main/examples">the repo</a>.</p><p>In the simple example below, we generate synthetic time series data in the format required by Prophet, then train and test a single model with default hyperparameters, tracking said hyperparameters using an AimLogger object. Then, we calculate MSE and MAE and add those metrics to the AimLogger. As you can see, this snippet can easily be extended to work with hyperparameter search workflows.</p><pre>from aim.prophet import AimLogger<br>...<br><br>model = Prophet()<br>logger = AimLogger(prophet_model=model, repo=&quot;.&quot;, experiment=&quot;prophet_test&quot;)<br>...<br><br>metrics = {<br>    &quot;mse&quot;: mean_squared_error(test[&quot;y&quot;], preds.iloc[4000:][&quot;yhat&quot;]),<br>    &quot;mae&quot;: mean_absolute_error(test[&quot;y&quot;], preds.iloc[4000:][&quot;yhat&quot;]),<br>}<br>logger.track_metrics(metrics)</pre><p>Additionally, if you’re working with multivariate time series data, you can use Aim to track different dependent variable configurations to see which combination results in the best performance (however, be mindful that you need to know the future values of your features to forecast your target variable). 
Here’s a simple code snippet doing just that:</p><pre># Here, we add an extra variable to our dataset<br>data = pd.DataFrame(<br>    {<br>        &quot;y&quot;: np.random.rand(num_days, 1),<br>        &quot;ds&quot;: rng,<br>        &quot;some_feature&quot;: np.random.randint(10, 20, num_days),<br>    }<br>)<br><br>model = Prophet()<br># Prophet won&#39;t use the &quot;some_feature&quot; variable without the following line<br>model.add_regressor(&quot;some_feature&quot;)<br>logger = AimLogger(prophet_model=model, repo=&quot;.&quot;, experiment=&quot;prophet_with_some_feature&quot;)</pre><p>Now, the extra feature(s) will be tracked as a hyperparameter called <strong>extra_regressors</strong>.</p><p>Take a look at a simple, end-to-end example <a href="https://github.com/aimhubio/aim/blob/main/examples/prophet_track.py">here</a>.</p><h3>Viewing experiment logs</h3><p>After running the experiment, we can view the logs by executing aim up from the command line in the aim_logs directory. When the UI is opened, we can see the logs of all the experiments with their corresponding metrics and hyperparameters by navigating to prophet_test/runs and selecting the desired run.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Aw3FjV81meA0C_UbMG7bnQ.png" /></figure><p>Additionally, we can monitor system metrics, environment variables, and packages installed in the virtual environment, among other things.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/800/1*Syw9V1IQLf78fAqp-HYXRA.gif" /></figure><p>These features make it easy to track and compare different Prophet experiments on an arbitrary number of metrics, both accuracy and performance-related.</p><h3>Using Aim’s Pythonic search to filter metrics and hyperparameters</h3><p>Given the same dataset and the same hyperparameters, Prophet is guaranteed to produce the same exact model, which means the metrics will be the same. However, if we’re working with several time series (e.g. 
forecasting demand using different factors), we might want to fit many different Prophet models to see which factors have a bigger effect on the target variable. Similarly, we might want to filter experiments by hyperparameter values. In cases like these, Aim’s <em>pythonic search</em> functionality can be super useful. Say we only want to see the models with MAE ≤ 0.25. We can go to the metrics section and search exactly like we would in Python, say ((metric.name == &quot;mae&quot;) and (metric.last &lt;= 0.25)) (the <strong>last</strong> part is meant for neural networks, where you might want to see the metric at the last epoch). Here’s a visual demonstration of this feature:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*w6tJtlwtY4v7RHXf5E_mBg.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*PhsN41wOUX-giiwhJXg6NA.png" /></figure><p>As you can see, filtering based on the metrics is super easy and convenient. The pythonic search functionality can also be used to filter based on other parameters.</p><h3>Conclusion</h3><p>To sum up, Aim’s integration with Prophet allows one to easily track an arbitrary number of Prophet runs with different hyperparameters and feature variable configurations, as well as arbitrary user-defined metrics, while also allowing one to monitor system performance and see how many resources model training consumes. Aim UI also makes it easy to filter runs based on hyperparameter and metric values with its <em>pythonic search</em> functionality. All these features can make forecasting with Prophet a breeze!</p><h3>Learn More</h3><p><a href="https://aimstack.readthedocs.io/en/latest/overview.html">Aim is on a mission to democratize AI dev tools.</a></p><p>We have been incredibly lucky to get help and contributions from the amazing Aim community. 
It’s humbling and inspiring 🙌</p><p>Try out <a href="https://github.com/aimhubio/aim">Aim</a>, join the <a href="https://community.aimstack.io/">Aim community</a>, share your feedback, open issues for new features and bugs.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=f2e2a80a356f" width="1" height="1" alt=""><hr><p><a href="https://medium.com/aimstack/reproducible-forecasting-with-prophet-and-aim-f2e2a80a356f">Reproducible forecasting with Prophet and Aim</a> was originally published in <a href="https://medium.com/aimstack">AimStack</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Hugging the Chaos: Connecting Datasets to Trainings with Hugging Face and Aim]]></title>
            <link>https://medium.com/aimstack/hugging-the-chaos-connecting-datasets-to-trainings-with-hugging-face-and-aim-141ed680e603?source=rss----d26c2a464e51---4</link>
            <guid isPermaLink="false">https://medium.com/p/141ed680e603</guid>
            <category><![CDATA[mlops]]></category>
            <category><![CDATA[hugging-face]]></category>
            <category><![CDATA[dataset]]></category>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[pytorch]]></category>
            <dc:creator><![CDATA[Gor Arakelyan]]></dc:creator>
            <pubDate>Thu, 16 Feb 2023 21:20:15 GMT</pubDate>
            <atom:updated>2023-02-16T21:20:15.395Z</atom:updated>
<content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*-luoNtZBpA1SGkTsHzCrkA.jpeg" /><figcaption>Generated by Midjourney</figcaption></figure><h3>The cost of neglecting experiment management</h3><p>Working with large and frequently changing datasets is hard!<br>You can easily end up in a mess if you don’t have a system that traces dataset versions and connects them to your experiments. Moreover, the lack of traceability can make it impossible to compare your experiments effectively, let alone reproduce them.</p><p>In this article, we will explore how you can combine <a href="https://github.com/huggingface/datasets">Hugging Face Datasets</a> and <a href="https://github.com/aimhubio/aim">Aim</a> to make machine learning experiments traceable, reproducible and easier to compare.</p><p>Let’s dive in and get started!</p><h3>Hugging Face Datasets + Aim = ❤</h3><p><a href="https://github.com/huggingface/datasets">Hugging Face Datasets</a> is a fantastic library that makes it super easy to access and share datasets for audio, computer vision, and NLP tasks. With Datasets, you’ll never have to worry about manually loading and versioning your data again.</p><p><a href="https://github.com/aimhubio/aim">Aim</a>, on the other hand, is an easy-to-use and open-source experiment tracker with lots of superpowers. It enables easily logging training runs, comparing them via a beautiful UI, and querying them programmatically.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*MeZVBvuC_vExlOoZUT84mw.png" /></figure><p>Hugging Face Datasets and Aim combined are a powerful way to track training runs and their respective dataset metadata. 
This lets you compare your experiments based on the dataset version and reproduce them seamlessly.</p><h3>Project overview</h3><p>Let’s go through an image classification project and show how you can use Datasets to load and version your data, and Aim to keep track of everything along the way.</p><p>Here’s what we’ll be using:</p><ul><li><a href="https://github.com/huggingface/datasets">Hugging Face Datasets</a> to load and manage the dataset.</li><li><a href="http://huggingface.co">Hugging Face Hub</a> to host the dataset.</li><li><a href="https://github.com/pytorch/pytorch">PyTorch</a> to build and train the model.</li><li><a href="https://github.com/aimhubio/aim">Aim</a> to keep track of all the model and dataset metadata.</li></ul><p>Our dataset is going to be called “A-MNIST” — a version of the “MNIST” dataset with extra samples added. We’ll start with the original “MNIST” dataset and then add 60,000 rotated versions of the original training samples to create a new, augmented version. We will run trainings on both dataset versions and see how the models are performing.</p><h3>Dataset preparation</h3><h4>Uploading the dataset to Hub</h4><p>Let’s head over to Hugging Face Hub and create a new dataset repository called <a href="https://huggingface.co/datasets/gorar/A-MNIST">“A-MNIST”</a> to store our dataset.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*lK4w_iokcetDnyOi9HgZbw.png" /></figure><p>Currently, the repository is empty, so let’s upload the initial version of the dataset.</p><p>We’ll use the original MNIST dataset as the first version, v1.0.0. To do this, we’ll need to upload the dataset files along with the dataset loading script A-MNIST.py. 
The dataset loading script is a python file that defines the different configurations and splits of the dataset, as well as how to download and process the data:</p><pre>class AMNIST(datasets.GeneratorBasedBuilder):<br>    &quot;&quot;&quot;A-MNIST Data Set&quot;&quot;&quot;<br><br>    BUILDER_CONFIGS = [<br>            datasets.BuilderConfig(<br>                name=&quot;amnist&quot;,<br>                version=datasets.Version(&quot;1.0.0&quot;),<br>                description=_DESCRIPTION,<br>            )<br>        ]<br>    ...</pre><p><em>See the full script here: </em><a href="https://huggingface.co/datasets/gorar/A-MNIST/blob/main/A-MNIST.py"><em>https://huggingface.co/datasets/gorar/A-MNIST/blob/main/A-MNIST.py</em></a></p><p>We’ll have the following setup, including all the necessary git configurations and the dataset card:</p><pre>- `data` - contains the dataset.<br>    - `t10k-images-idx3-ubyte.gz` - test images.<br>    - `t10k-labels-idx1-ubyte.gz` - test labels.<br>    - `train-images-idx3-ubyte.gz` - train images.<br>    - `train-labels-idx1-ubyte.gz` - train labels.<br>- `A-MNIST.py` - the dataset loading script.</pre><p>Let’s commit the changes and push the dataset to the Hub.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*kzEHq-JVe6AyNm6xUPIWzg.png" /></figure><p>Awesome! 
Now we’ve got the first version of “A-MNIST” hosted on Hugging Face Hub!</p><h4>Augmenting the dataset</h4><p>Next up, let’s add more images to the MNIST dataset using augmentation techniques.</p><p>We will rotate all the train images by 20 degrees and append them to the dataset:</p><pre>rotated_images[i] = rotate(images[i], angle=20, reshape=False)</pre><p>We will also update the “A-MNIST” dataset version to 1.1.0:</p><pre>BUILDER_CONFIGS = [<br>    datasets.BuilderConfig(<br>        name=&quot;amnist&quot;,<br>        version=datasets.Version(&quot;1.1.0&quot;),<br>        description=_DESCRIPTION,<br>    )<br>]</pre><p>After preparing the new train dataset, let’s commit the changes and push the upgraded version to the Hub.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*yoIdeuBE1pB0VMh-ZAY7Rw.png" /></figure><p>Perfect, v1.1.0 is now available on the Hub! This means that we can easily access it in the training script using Hugging Face Datasets. 🎉</p><h3>Training setup</h3><p>Let’s load the dataset using the datasets Python package. It’s really simple, just one line of code:</p><pre>dataset = load_dataset(&quot;gorar/A-MNIST&quot;)</pre><p>One of the amazing features of datasets is that it has native support for PyTorch, which allows us to prepare and initialize dataset loaders with ease:</p><pre>from datasets import load_dataset<br>from torch.utils.data import DataLoader<br><br># Loading the dataset<br>dataset = load_dataset(&quot;gorar/A-MNIST&quot;)<br><br># Defining train and test splits<br>train_dataset = dataset[&#39;train&#39;].with_format(&#39;torch&#39;)<br>test_dataset = dataset[&#39;test&#39;].with_format(&#39;torch&#39;)<br><br># Initializing data loaders for train and test split<br>train_loader = DataLoader(dataset=train_dataset, batch_size=4, shuffle=True)<br>test_loader = DataLoader(dataset=test_dataset, batch_size=4, shuffle=True)</pre><p>We’re ready to write the training script and build the model. 
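Before moving on, the augmentation step above can be made concrete. The following is a hedged sketch, not the article's exact code: it assumes `rotate` is `scipy.ndimage.rotate` (as the article's one-liner suggests) and uses a tiny stand-in batch instead of the real 60,000-image train split.

```python
import numpy as np
from scipy.ndimage import rotate

def augment_with_rotation(images, labels, angle=20):
    # Rotate every image by `angle` degrees, keeping the 28x28 shape via
    # reshape=False, then append the rotated copies to the original set.
    rotated = np.stack([rotate(img, angle=angle, reshape=False) for img in images])
    return np.concatenate([images, rotated]), np.concatenate([labels, labels])

# Tiny stand-in batch instead of the real 60,000 MNIST train images.
images = np.random.randint(0, 256, size=(4, 28, 28)).astype(np.uint8)
labels = np.array([0, 1, 2, 3])
aug_images, aug_labels = augment_with_rotation(images, labels)
print(aug_images.shape, aug_labels.shape)  # (8, 28, 28) (8,)
```

The helper name `augment_with_rotation` is illustrative; the key point is that labels are duplicated alongside the rotated images so the augmented set stays consistent.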
We are using PyTorch to build a simple convolutional neural network with two convolutional layers:</p><pre>class ConvNet(nn.Module):<br>    def __init__(self, num_classes=10):<br>        super(ConvNet, self).__init__()<br>        self.layer1 = nn.Sequential(<br>            nn.Conv2d(1, 16, kernel_size=5, stride=1, padding=2),<br>            nn.BatchNorm2d(16),<br>            nn.ReLU(),<br>            nn.MaxPool2d(kernel_size=2, stride=2))<br>        self.layer2 = nn.Sequential(<br>            nn.Conv2d(16, 32, kernel_size=5, stride=1, padding=2),<br>            nn.BatchNorm2d(32),<br>            nn.ReLU(),<br>            nn.MaxPool2d(kernel_size=2, stride=2))<br>        self.fc = nn.Linear(7 * 7 * 32, num_classes)<br><br>    def forward(self, x):<br>        out = self.layer1(x)<br>        out = self.layer2(out)<br>        out = out.reshape(out.size(0), -1)<br>        out = self.fc(out)<br>        return out<br>...</pre><p><em>See the full code here: </em><a href="https://gist.github.com/gorarakelyan/936fb7b8fbde4de807500c5617b47ea8"><em>https://gist.github.com/gorarakelyan/936fb7b8fbde4de807500c5617b47ea8</em></a></p><p>We’ve got everything ready now. Let’s kick off the training by running:</p><pre>python train.py</pre><p>Hurray, the training is underway! 🏃</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*YYTjRUMNbr_1e9mFX0tyhw.png" /></figure><p>It would be tough to monitor the training progress from the terminal alone. 
This is where Aim’s superpowers come into play.</p><h3>Integrating Aim</h3><h4>Tracking trainings</h4><p>Aim offers a user-friendly interface to keep track of your training hyper-parameters and metrics:</p><pre>import aim<br><br>aim_run = aim.Run()<br><br># Track hyper-parameters<br>aim_run[&#39;hparams&#39;] = {<br>    &#39;num_epochs&#39;: num_epochs,<br>    &#39;num_classes&#39;: num_classes,<br>    ...<br>}<br><br># Track metrics<br>for i, samples in enumerate(train_loader):<br>    ...<br>    aim_run.track(acc, name=&#39;accuracy&#39;, epoch=epoch, context={&#39;subset&#39;: &#39;train&#39;})<br>    aim_run.track(loss, name=&#39;loss&#39;, epoch=epoch, context={&#39;subset&#39;: &#39;train&#39;})</pre><p>Furthermore, the awesome thing about Aim is that it is tightly integrated with the ML ecosystem.</p><p>Since v3.16, Aim has a built-in integration with Hugging Face Datasets. Simply import the HFDataset module from Aim and you can track all of your dataset metadata with just a single line of code!</p><pre>from aim.hf_dataset import HFDataset<br><br>aim_run[&#39;dataset&#39;] = HFDataset(dataset)</pre><p>Aim will automagically gather the metadata from the dataset instance, store it within the Aim run, and display it on the Aim UI run page, including details like the dataset’s description, version, features, etc.</p><h3>Experimentation</h3><h4>Conducting trainings</h4><p>We’ll perform several training runs on both versions of the dataset and adjust learning rates to evaluate their impact on the training process. Specifically, we’ll use versions 1.0.0 and 1.1.0 of our dataset, and experiment with learning rates of 0.01, 0.03, 0.1, and 0.3.</p><p>Datasets provides the ability to choose which dataset version we want to load. 
To do this, we just need to pass the version and git revision of the dataset we want to work with:</p><pre># Loading A-MNIST v1.0.0<br>dataset = load_dataset(&quot;gorar/A-MNIST&quot;, version=&#39;1.0.0&#39;, revision=&#39;dcc966eb0109e31d23c699199ca44bc19a7b1b47&#39;)<br><br># Loading A-MNIST v1.1.0<br>dataset = load_dataset(&quot;gorar/A-MNIST&quot;, version=&#39;1.1.0&#39;, revision=&#39;da9a9d63961462871324d514ca8cdca1e5624c5c&#39;)</pre><p>We will run a simple bash script to execute the trainings. However, for a more comprehensive hyper-parameter tuning approach, tools like Ray Tune or other tuning frameworks can be used.</p><h4>Exploring training results via Aim</h4><p>Now that we have integrated Aim into our training process, we can explore the results in a more user-friendly way. To launch Aim’s UI, we need to run the aim up command. The following message will be printed to the terminal, indicating that the UI is up and running:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*cfU9n1gY6pEJg4G2CJ8y9w.png" /></figure><p>To access the UI, let’s navigate to 127.0.0.1:43800 in the browser.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*FHcB19xB7dEj9kGdEEMEZA.png" /></figure><p>We can see our trainings on the contributions map and in the activity feed. Let’s take a closer look at an individual run’s details by navigating to the run’s page on the UI.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*3TwA3xW06wD98adH2UF3wA.png" /></figure><p>All of the information is displayed in an organized and easy-to-read manner, allowing us to quickly understand how our training is performing and identify any potential issues.</p><p>We can view the tracked hyper-parameters, metadata related to the dataset, and metric results. 
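As an aside, the "simple bash script" mentioned earlier is not shown in the article. A minimal Python stand-in that enumerates the same 2×4 grid of runs might look like the following sketch; note that the --dataset-version and --lr flag names are assumptions for illustration, not flags taken from the article's train.py.

```python
from itertools import product

versions = ["1.0.0", "1.1.0"]            # the two A-MNIST dataset versions
learning_rates = [0.01, 0.03, 0.1, 0.3]  # the learning-rate grid from the article

# Build one launch command per (version, learning rate) pair; swap `print`
# for subprocess.run(cmd.split()) to actually execute the trainings.
commands = [
    f"python train.py --dataset-version {v} --lr {lr}"
    for v, lr in product(versions, learning_rates)
]
for cmd in commands:
    print(cmd)
```

This yields eight runs in total, matching the comparisons discussed below on the Aim UI.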
Additionally, Aim automatically gathers information like system resource usage and outputs from the terminal.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*fSYcQgvpzbFvan_E7ygvIw.png" /></figure><p>The UI also provides tools for visualizing and comparing results from multiple runs, making it easy to compare different model architectures, hyper-parameter, and dataset settings to determine the best approach for a given problem.</p><h4>Comparing trainings via Aim</h4><p>Let’s find out which models performed the best. We can easily compare the metrics of all the runs by opening up the Metrics Explorer at 127.0.0.1:43800/metrics.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*PNQIEWDX5oKKz7CyBY0okw.png" /></figure><p>Aim is very powerful when it comes to filtering runs and grouping. It provides the ability to group metrics based on any tracked metadata value.</p><p>Let’s group by run.dataset.dataset.meta.version to see how the models performed based on the dataset version they were trained on.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*4yg-joaFpEIrfVcbc5LMMg.png" /></figure><p>According to the legends section, the green lines represent models that were trained on the v1.0.0 dataset (the original MNIST dataset), while the blue lines represent models trained on v1.1.0, the augmented dataset.</p><p>Now, to improve visibility and better evaluate performance, let’s go ahead and smooth out these lines.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*HhgXxth5iZHZ2XHmwuA_Dg.png" /></figure><p>It appears that both models performed similarly on both dataset versions. 
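Conceptually, grouping like this just partitions the runs by a tracked metadata value. A toy, pure-Python illustration of the idea (the run dicts below are fabricated for demonstration and mirror the metadata Aim tracks per run):

```python
from collections import defaultdict

# Fabricated run metadata, one dict per training run.
runs = [
    {"dataset_version": "1.0.0", "lr": 0.01},
    {"dataset_version": "1.1.0", "lr": 0.01},
    {"dataset_version": "1.0.0", "lr": 0.3},
    {"dataset_version": "1.1.0", "lr": 0.3},
]

def group_by(runs, key):
    # Partition runs into buckets keyed by the chosen metadata value.
    groups = defaultdict(list)
    for run in runs:
        groups[run[key]].append(run)
    return dict(groups)

by_version = group_by(runs, "dataset_version")
print(sorted(by_version))        # ['1.0.0', '1.1.0']
print(len(by_version["1.0.0"]))  # 2
```

In the UI, each resulting bucket is then rendered with its own color or stroke, which is exactly what the version- and learning-rate groupings in this section do.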
Since we also experimented with the learning rate hyper-parameter, let’s group metrics by the learning rate to see its impact on model performance.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*dd_VOX2BLPShEmVqPN3byw.png" /></figure><p>It seems that the blue lines, associated with a learning rate of 0.01, demonstrated the best performance!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*dxSto6wosNPeO-P7orw4cg.png" /></figure><p>To sum it up, regardless of the dataset version, the models trained with a learning rate of 0.01 came out on top in terms of performance.</p><p>With just a few clicks, we were able to compare different runs based on the dataset and learning rate value they were trained on! 🎉</p><h3>Reproducibility</h3><p>Have you noticed how we achieved out-of-the-box reproducibility with the Hugging Face Datasets and Aim integration? By versioning the dataset and keeping track of its revision, we can effortlessly reproduce experiments using the same dataset they were previously trained on.</p><p>Aim stores model hyper-parameters, dataset metadata, and other moving pieces, which allows us to quickly recreate and analyze previous experiments!</p><h3>Conclusion</h3><p>In this article, we took a closer look at how to get started with uploading and using datasets through the Hugging Face Hub and Datasets. We carried out a series of trainings, and Aim helped us keep track of all the details and made it easier to stay organized.</p><p>We compared the results of the trainings conducted on different versions of the dataset. 
We also discovered how helpful the combination of Datasets and Aim can be for reproducing previous runs.</p><p>Simplifying our daily work with an efficient stack helps us focus on what really matters — getting the best results from our machine learning experiments.</p><h3>Learn more</h3><p>Check out the Aim + Datasets integration docs <a href="https://aimstack.readthedocs.io/en/latest/quick_start/supported_types.html#logging-huggingface-datasets-dataset-info-with-aim">here</a>.<br>Datasets repo: <a href="https://github.com/huggingface/datasets">https://github.com/huggingface/datasets</a><br>Aim repo: <a href="https://github.com/aimhubio/aim">https://github.com/aimhubio/aim</a></p><p>If you have questions, join the <a href="https://community.aimstack.io/">Aim community</a>, share your feedback, and open issues for new features and bugs. You’re most welcome! 🙌</p><p>Drop a ⭐️ on <a href="https://github.com/aimhubio/aim">GitHub</a> if you find Aim useful.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=141ed680e603" width="1" height="1" alt=""><hr><p><a href="https://medium.com/aimstack/hugging-the-chaos-connecting-datasets-to-trainings-with-hugging-face-and-aim-141ed680e603">Hugging the Chaos: Connecting Datasets to Trainings with Hugging Face and Aim</a> was originally published in <a href="https://medium.com/aimstack">AimStack</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Aim v3.16 — Run messages in UI, TensorBoard real-time sync, integration with Hugging Face Datasets]]></title>
            <link>https://medium.com/aimstack/aim-v3-16-run-messages-in-ui-tensorboard-real-time-sync-integration-with-hugging-face-datasets-8342172ff99?source=rss----d26c2a464e51---4</link>
            <guid isPermaLink="false">https://medium.com/p/8342172ff99</guid>
            <category><![CDATA[mlops]]></category>
            <category><![CDATA[experiment-tracking]]></category>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[artificial-intelligence]]></category>
            <dc:creator><![CDATA[Gor Arakelyan]]></dc:creator>
            <pubDate>Wed, 15 Feb 2023 18:58:28 GMT</pubDate>
            <atom:updated>2023-02-15T18:58:28.600Z</atom:updated>
            <content:encoded><![CDATA[<h3>Aim v3.16 — Run messages in UI, TensorBoard real-time sync, integration with Hugging Face Datasets</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*G-AzW2_QT8nvsgLUQH9JeA.png" /></figure><p>Hey community, excited to announce Aim v3.16 is out! 🚀 It is packed with new integrations and key enhancements.</p><blockquote><em>We are on a mission to democratize AI dev tools and are incredibly lucky to have the support of the community. Every question and every issue makes Aim better!</em></blockquote><blockquote><em>Congratulations to </em><a href="https://github.com/timokau"><em>timokau,</em></a><em> </em><a href="https://github.com/dsblank"><em>dsblank</em></a><em> and </em><a href="https://github.com/grigoryan-davit"><em>grigoryan-davit</em></a><em> for their first contributions. 🙌</em></blockquote><h3>Key highlights</h3><h4><strong>Run messages tab in UI</strong></h4><p>Starting from version 3.15, Aim has offered the ability to <a href="https://aimstack.readthedocs.io/en/latest/using/logging.html">log training messages</a> and even <a href="https://aimstack.readthedocs.io/en/latest/using/notifications.html">send notifications</a> to Slack or Workplace.</p><p>With version 3.16, you can view all your <a href="https://aimstack.readthedocs.io/en/latest/using/logging.html">run messages</a> right in the UI. All the information you need is just a click away! 💫</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*PxqbmJJcqYaH1tLmcg59Xw.png" /></figure><h4><strong>Real-time sync with TensorBoard logs</strong></h4><p>Aim smoothly integrates with experiment tracking tools. If you’re part of a team that has already deeply integrated TensorBoard into its projects and pipelines, you will love this enhancement.</p><p>With just one simple line of code, you can integrate Aim with your existing TensorBoard projects. 
This means that all your logs will be automatically converted to Aim format in real-time. 🔥🔥🔥</p><pre>from aim.ext.tensorboard_tracker import Run<br><br>aim_run = Run(<br>    sync_tensorboard_log_dir=&#39;TB_LOG_DIR_TO_SYNC_RUNS&#39;<br>)</pre><h4><strong>Support for Python 3.11</strong></h4><p>The Python community received fantastic news towards the end of 2022: stable Python 3.11 was officially released!</p><p>With the Aim 3.16 release, you can install and use Aim in your Python 3.11 projects. 🎉</p><pre>pip3.11 install aim</pre><h4><strong>Dropped support for Python 3.6</strong></h4><p>Time to say goodbye: Python 3.6 has reached end-of-life. 💀</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*WpHtGUIfWc5RXtDk0TNKTQ.png" /></figure><p>This change allows us to take advantage of the latest advancements in the Python language and provide you with a more robust and reliable library.</p><p>Please note that if you have been using Aim in your Python 3.6 projects, you will need to upgrade to a newer Python version in order to continue using Aim.</p><p><strong>We apologize for any inconvenience this may cause, but rest assured that the improved stability of Aim will make it worth the transition.</strong></p><h3>New integrations</h3><p>Aim 3.16 is packed with new integrations with your favorite ML tools!</p><h4><strong>Aim + Hugging Face Datasets = ❤</strong></h4><p>Happy to share that you can now easily track and store dataset metadata in an Aim run and explore it in the UI.</p><pre>from datasets import load_dataset<br><br>from aim import Run<br>from aim.hf_dataset import HFDataset<br><br># Load the dataset<br>dataset = load_dataset(&#39;rotten_tomatoes&#39;)<br><br># Store the dataset metadata<br>run = Run()<br>run[&#39;datasets_info&#39;] = HFDataset(dataset)</pre><p>See the docs <a href="https://aimstack.readthedocs.io/en/latest/quick_start/supported_types.html#logging-huggingface-datasets-dataset-info-with-aim">here</a>.</p><h4><strong>Aim + Acme = 
❤</strong></h4><p>Aim now has built-in support for tracking <a href="https://dm-acme.readthedocs.io/en/latest/">Acme</a> trainings. It takes a few simple steps to integrate Aim into your training script.</p><ol><li>Explicitly import the AimCallback and AimWriter for tracking training metadata:</li></ol><pre>from aim.sdk.acme import AimCallback, AimWriter</pre><p>2. Initialize an Aim Run via AimCallback, and create a log factory:</p><pre>aim_run = AimCallback(repo=&quot;.&quot;, experiment_name=&quot;acme_test&quot;)<br><br>def logger_factory(<br>    name: str,<br>    steps_key: Optional[str] = None,<br>    task_id: Optional[int] = None,<br>) -&gt; loggers.Logger:<br>    return AimWriter(aim_run, name, steps_key, task_id)</pre><p>3. Pass the logger factory to logger_factory upon initiating your training:</p><pre>experiment_config = experiments.ExperimentConfig(<br>    builder=d4pg_builder,<br>    environment_factory=make_environment,<br>    network_factory=network_factory,<br>    logger_factory=logger_factory,<br>    seed=0,<br>    max_num_actor_steps=5000)</pre><p>See the docs <a href="https://aimstack.readthedocs.io/en/latest/quick_start/integrations.html#integration-with-acme">here</a>.</p><h4><strong>Aim + Stable-Baselines3 = ❤</strong></h4><p>Now you can easily track <a href="https://stable-baselines3.readthedocs.io/en/master/">Stable-Baselines3</a> trainings with Aim. It takes two steps to integrate Aim into your training script.</p><ol><li>Explicitly import the AimCallback for tracking training metadata:</li></ol><pre>from aim.sb3 import AimCallback</pre><p>2. 
Pass the callback to callback upon initiating your training:</p><pre>model.learn(total_timesteps=10_000, callback=AimCallback(repo=&#39;.&#39;, experiment_name=&#39;sb3_test&#39;))</pre><p>See the docs <a href="https://aimstack.readthedocs.io/en/latest/quick_start/integrations.html#integration-with-stable-baselines3">here</a>.</p><h3>Learn more</h3><p><a href="https://aimstack.readthedocs.io/en/latest/overview.html">Aim is on a mission to democratize AI dev tools.</a> 🙌</p><p>Try out <a href="https://github.com/aimhubio/aim">Aim</a>, join the <a href="https://community.aimstack.io">Aim community</a>, share your feedback, open issues for new features, bugs.</p><p>Don’t forget to leave us a star on <a href="https://github.com/aimhubio/aim">GitHub</a> if you think Aim is useful. ⭐️</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=8342172ff99" width="1" height="1" alt=""><hr><p><a href="https://medium.com/aimstack/aim-v3-16-run-messages-in-ui-tensorboard-real-time-sync-integration-with-hugging-face-datasets-8342172ff99">Aim v3.16 — Run messages in UI, TensorBoard real-time sync, integration with Hugging Face Datasets</a> was originally published in <a href="https://medium.com/aimstack">AimStack</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Aim and MLflow — Choosing Experiment Tracker for Zero-Shot Cross-Lingual Transfer]]></title>
            <link>https://medium.com/aimstack/aim-and-mlflow-choosing-experiment-tracker-for-zero-shot-cross-lingual-transfer-4bad0a199fc7?source=rss----d26c2a464e51---4</link>
            <guid isPermaLink="false">https://medium.com/p/4bad0a199fc7</guid>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[machine-learning-tools]]></category>
            <dc:creator><![CDATA[Hovhannes Tamoyan]]></dc:creator>
            <pubDate>Mon, 13 Feb 2023 20:31:05 GMT</pubDate>
            <atom:updated>2023-02-14T16:57:38.897Z</atom:updated>
            <content:encoded><![CDATA[<h3>Aim and MLflow — Choosing Experiment Tracker for Zero-Shot Cross-Lingual Transfer</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*v64PbdBn6kBvsH3t5bkv8w.png" /></figure><p>The release of aimlflow, a tool that seamlessly pairs Aim’s powerful experiment tracking user interface with MLflow logs, sparked user curiosity.</p><p>The question arises as to why we need aimlflow, or why we would view MLflow-tracked logs in Aim. The answer is that Aim provides a highly effective user interface that unlocks valuable insights.</p><p>In this blog post, we will address the zero-shot cross-lingual transfer task in NLP and, subsequently, monitor the metadata using both Aim and MLflow. Finally, we will draw insights from our experiments using their respective user interfaces.</p><h3>Task Setup</h3><p>The task that we will be tackling in this scope is zero-shot cross-lingual transfer. Zero-shot cross-lingual transfer refers to a machine learning technique where a model is trained in one language but is able to perform well in another language without any additional fine-tuning. This means that the model has the ability to generalize its understanding of the task across different languages without the need for additional training data.</p><p>In particular, we will train a model for the natural language inference (NLI) task on the English dataset and then classify the label of a given sample written in a different language, without additional training. 
This approach is useful in situations where labeled data is scarce in some languages and plentiful in others.</p><p>Zero-shot cross-lingual transfer can be achieved through various methods, including cross-lingual word embeddings and shared multilingual representations learned in a common space (a multilingual language model); the latter is the more widely used.</p><p>We will explore two techniques in our experimentation:</p><ul><li>Fine-tuning the entire pre-trained multilingual language model: adjusting all the weights of the network to solve the given classification task, resuming training from the last state of the pre-trained model.</li><li>Feature extraction, which refers to attaching a classification head on top of the pre-trained language model and only training that portion of the network.</li></ul><p>In both techniques, we will undertake training utilizing the en subset of the XNLI dataset. Following this, we will conduct evaluations on our evaluation set, consisting of six language subsets from the XNLI dataset: English (en), German (de), French (fr), Spanish (es), Chinese (zh), and Arabic (ar).</p><h3>The Datasets</h3><p>We will utilize the XNLI (cross-lingual NLI) dataset, which is a selection of a few thousand examples from the MNLI (multi NLI) dataset, translated into 14 different languages, including some with limited resources.</p><p>The template of the NLI task is as follows. Given a pair of sentences, a premise and a hypothesis, we need to determine whether the hypothesis is true (entailment), false (contradiction), or undetermined (neutral) given the premise.</p><p>Let’s take a look at a few samples to get the hang of it. Say the given hypothesis is “<em>Issues in Data Synthesis.</em>” and the premise is “<em>Problems in data synthesis.</em>”. Now there are 3 options: the hypothesis entails the premise, contradicts it, or is neutral. 
In this pair of sentences, it is obvious that the answer is entailment, because the words issues and problems are synonyms and the term data synthesis remains the same.</p><p>Another example, this time of a neutral pair of sentences, is the following: the hypothesis is “<em>She was so happy she couldn&#39;t stop smiling.</em>” and the premise is “<em>She smiled back.</em>”. The first sentence doesn’t imply the second one; however, it doesn’t contradict it either. Thus, they are neutral.</p><p>For an instance of contradiction, the hypothesis is <em>“The analysis proves that there is no link between PM and bronchitis.”</em> and the premise is “<em>This analysis pooled estimates from these two studies to develop a C-R function linking PM to chronic bronchitis.</em>”. In the hypothesis, it is stated that the analysis shows that there is no link between two biological terms. Meanwhile, the premise states the opposite, that the analysis combined two studies to show that there is a connection between the two terms.</p><p>For more examples, please explore the HuggingFace Datasets page powered by Streamlit: <a href="https://huggingface.co/datasets/viewer/">https://huggingface.co/datasets/viewer/</a>.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*yAfkZJ1RS2ulEYmJFIOlMA.png" /></figure><h3>The Models</h3><p>In our experiments, we will utilize the following set of pre-trained multilingual language models:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/cda70e39830927f93c14993e985b27b7/href">https://medium.com/media/cda70e39830927f93c14993e985b27b7/href</a></iframe><p>We will load each model with its last state weights and continue training the entire network (fine-tuning) or the classification head only (feature extraction) from that state. 
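The difference between the two techniques boils down to which parameters stay trainable. A small PyTorch sketch with a toy encoder-plus-head model (not the article's actual models, whose encoders are full transformer stacks) illustrates the feature-extraction variant:

```python
import torch.nn as nn

# Toy stand-in: a "pre-trained" body plus a fresh classification head.
class ToyClassifier(nn.Module):
    def __init__(self, hidden=768, num_labels=3):
        super().__init__()
        self.encoder = nn.Linear(hidden, hidden)   # pretend pre-trained encoder
        self.head = nn.Linear(hidden, num_labels)  # entailment/neutral/contradiction

    def forward(self, x):
        return self.head(self.encoder(x))

model = ToyClassifier()

# Feature extraction: freeze the encoder so only the head receives gradients.
# (For fine-tuning, skip this loop and leave every parameter trainable.)
for p in model.encoder.parameters():
    p.requires_grad = False

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)  # only the head's 768*3 + 3 = 2307 parameters remain trainable
```

With real Hugging Face models, the same loop would typically run over the pre-trained base model's parameters before handing the model to the trainer.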
All of the mentioned models are trained with the Masked Language Modeling (MLM) objective.</p><h3>Setting up Training Environment</h3><p>Before beginning, here is what the final structure of our directory will look like:</p><pre>aim-and-mlflow-usecase<br>├── logs<br>│   ├── aim_callback<br>│   │   └── .aim<br>│   ├── aimlflow<br>│   │   └── .aim<br>│   ├── checkpoints<br>│   └── mlruns<br>└── main.py</pre><p>Let’s start off by creating the main directory; we named it aim-and-mlflow-usecase, but you can name it anything you want. Next, we need to download main.py from the following source: <a href="https://github.com/aimhubio/aimlflow/tree/main/examples/cross-lingual-transfer">https://github.com/aimhubio/aimlflow/tree/main/examples/cross-lingual-transfer</a>. The code explanation and sample usage can be found in the README.md file of the directory. We will be using this script to run our experiments.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*4l7ZT-XYl8fKjm4zDRvEUg.png" /></figure><p>The logs directory, as the name suggests, stores the logs. In the checkpoints folder, all the model states will be saved. The mlruns folder is the repository for MLflow experiments. The aim_callback folder will store the repository of Aim runs tracked using Aim’s built-in callback for Hugging Face Transformers, while the aimlflow folder will store the runs converted from MLflow using the aimlflow tool.</p><blockquote><em>It is important to keep in mind that due to limited computational resources, we have chosen to use only 15,000 samples of the training dataset and will be training for a mere 3 epochs with a batch size of 8. Consequently, the obtained results may not be optimal. 
Nevertheless, the aim of this use case is not to achieve the best possible results, but rather to showcase the advantages of using both Aim and MLflow.</em></blockquote><p>In order to start the training process, we will be using the following command. But first, let’s navigate to the directory where our script is located (aim-and-mlflow-usecase in our case).</p><pre>python main.py feature-extraction \<br>    --model-names bert-base-multilingual-cased bert-base-multilingual-uncased xlm-roberta-base distilbert-base-multilingual-cased \<br>    --eval-datasets-names en de fr es ar zh \<br>    --output-dir {PATH_TO}/logs</pre><p>Here, {PATH_TO} is the absolute path of the aim-and-mlflow-usecase directory. In this particular command, we use the feature-extraction technique for 4 pre-trained models and validate on 6 languages of the XNLI dataset. In parallel, or after the first process completes, we can run the same command, this time using the fine-tuning technique:</p><pre>python main.py fine-tune \<br>    --model-names bert-base-multilingual-cased bert-base-multilingual-uncased xlm-roberta-base distilbert-base-multilingual-cased \<br>    --eval-datasets-names en de fr es ar zh \<br>    --output-dir {PATH_TO}/logs</pre><p>Go grab some snacks; trainings take a while 🍫 ☺️.</p><h3>Using aimlflow</h3><p>Meanwhile, one might wonder why we are tracking the experiment results using MLflow and Aim. We could simply track the metrics via MLflow and use aimlflow to convert and view our experiments live on Aim. 
Let’s first show how this can be done, and then tackle the question.</p><p>Install aimlflow on your machine via pip, if it is not already installed:</p><pre>$ pip install aimlflow</pre><p>Then run the sync command:</p><pre>$ aimlflow sync --mlflow-tracking-uri=logs/mlruns --aim-repo=logs/aimlflow/.aim</pre><p>This command will start converting all of the experiment hyperparameters, metrics, and artifacts from MLflow to Aim, and continuously update the database with new runs every 10 seconds.</p><p>More on how aimlflow can be set up for local and remote MLflow experiments can be found in these two blog posts, respectively:</p><ul><li><a href="https://medium.com/aimstack/exploring-mlflow-experiments-with-a-powerful-ui-238fa2acf89e"><strong>Exploring MLflow experiments with a powerful UI</strong></a></li><li><a href="https://medium.com/aimstack/how-to-integrate-aimlflow-with-your-remote-mlflow-3e9ace826eaf"><strong>How to integrate aimlflow with your remote MLflow</strong></a></li></ul><p>This is one approach to follow, but for an improved user interface experience, we suggest utilizing Aim’s built-in callback, aim.hugging_face.AimCallback, which is specifically designed for transformers.Trainer functionality. It tracks a vast array of information, including environment details, packages and their versions, CPU and GPU usage, and much more.</p><h3>Unlocking the Power of Data Analysis</h3><p>Once the training has completed some steps, we can start exploring the experiments and comparing the MLflow and Aim user interfaces. First, let’s launch both tools’ UIs. 
To do this, we need to navigate to the logs directory and launch the two UIs with the following commands:</p><pre>$ mlflow ui</pre><pre>$ aim up</pre><blockquote><em>Note that for the best possible experience, we will be using the </em><em>aim_callback/.aim repository in this demonstration, as it has deeper integration with the </em><em>Trainer.</em></blockquote><p>The UIs of both MLflow and Aim will be available by default on ports 5000 and 43800, respectively. You can access the homepages of each tool by visiting http://127.0.0.1:5000 for MLflow and http://127.0.0.1:43800 for Aim.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*d3ThRGvGGEvPFvLQ43VW1w.png" /><figcaption>The user interface of MLflow on first look</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*lXR2nt6w0WMo50aQ4TFJDA.png" /><figcaption>The user interface of Aim on first look</figcaption></figure><p>In order to gain valuable insights from the experiment results, it is imperative to navigate to the proper pages in both user interfaces. To do so in MLflow, we can visit the Compare Runs page:</p><p>After selecting all the experiments, navigate to the comparison page by clicking the Compare button.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Fk7qu58g2Qwyx-OFtu0mHg.png" /></figure><p>The Run details section presents the run metadata, including the start and end time and duration of the run. The Parameters section displays the hyperparameters used for the run, such as the optimizer and architecture. The Metrics section showcases the latest values for each metric.</p><p>Having access to all this information is great, but what if we want to explore the evolution of the metrics over time to gain insights into our training and evaluation processes? Unfortunately, MLflow does not offer this functionality. 
However, Aim does provide it, to our advantage.</p><p>To organize the parameters into meaningful groups for our experiment, simply go to Aim’s Metrics Explorer page and follow a few straightforward steps:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*qwyGgzlWQHR7nIW-rTD8Iw.gif" /></figure><h3>Gaining insights</h3><p>Let’s examine the charts more closely and uncover valuable insights from our experiments.</p><p>A quick examination of the charts reveals the following observations:</p><ul><li>The fine-tuning technique is clearly superior to feature extraction in terms of accuracy and loss, as evidenced by a comparison of the maximum and minimum results in charts 1 and 3, and charts 2 and 4, respectively.</li><li>The graphs for feature extraction show significant fluctuations across languages, whereas the results for fine-tuning vary greatly with changes to the model. To determine the extent of the variation, we can consolidate the metrics by removing either the grouping by language (color) or model (stroke), respectively. In this case, we will maintain the grouping by model name to examine the variation of each model for a given technique.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*dKJ2k0_oy8218EBIfhxn6Q.png" /></figure><ul><li>Even though we have only trained a single model with a larger batch size (16, instead of the default 8), it is still valuable to examine the trend. To accomplish this, we will eliminate the grouping by model name and group only by train_batch_size. As we can observe, after only 2500 steps, there is a trend of decreasing loss and increasing accuracy at a quicker pace. 
Thus, it is worth considering training with larger batch sizes.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Q7FN0NfWgWALylCCTwy3EQ.png" /></figure><ul><li>The charts unmistakably show that the bert-base-multilingual-cased model achieved the best accuracy results, with the highest score observed for the en subset, as the model was trained on that subset. Subsequently, es, fr, de, zh, and ar followed. Unsurprisingly the scores for the zh and ar datasets were lower, given that they belong to distinct linguistic families and possess unique syntax.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*NJoo9JlhjXfknqIVoFLGkw.png" /></figure><ul><li>Let us examine the training times and efficiencies. By setting the x-axis to align with relative time rather than default steps, we observe that the final tracking point of the fine-tuning technique took almost 25% more time to complete compared to the feature-extraction technique.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*iNlhMHrCtUtc72s4FCQLaQ.png" /></figure><ul><li>One can continue analysis further after training with bigger batch sizes and more variations of the learning rate, models, etc. The Parameter Explorer will then lend its aid in presenting intricate, multi-layered data in a visually appealing, multi-dimensional format. To demonstrate how the Parameter Explorer works, let’s pick the following parameters: train_batch_size, learning_rate,_name_or_path, loss, and the accuracies of sub_datasets. The following chart will be observed after clicking the Search button. 
From here we can see that the run which resulted in the highest accuracies for all subsets has a final loss value of 0.6, uses the bert-base-multilingual-cased model with a learning_rate of 5·10⁻⁵, and has a batch_size of 8.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*8J6bja914aud2IIO1Dlb5A.png" /></figure><p>Taking into account the aforementioned insights, we can move forward with future experiments. It is worth noting that while fine-tuning improved accuracy scores, it requires a slightly longer training time. Increasing the batch size and training for longer steps/epochs is expected to further enhance the results. Furthermore, tuning other hyperparameters such as the learning rate, weight decay, and dropout will make the experiment set more diverse and may lead to even better outcomes.</p><h3>Conclusion</h3><p>This blog post demonstrates how to solve an NLP task, namely zero-shot cross-lingual transfer, while tracking your run metrics and hyperparameters with MLflow and then utilizing Aim’s powerful user interface to obtain valuable insights from the experiments.</p><p>We also showcased how to use aimlflow, which has piqued user interest as a tool that seamlessly integrates the experiment tracking user interface of Aim with MLflow logs.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=4bad0a199fc7" width="1" height="1" alt=""><hr><p><a href="https://medium.com/aimstack/aim-and-mlflow-choosing-experiment-tracker-for-zero-shot-cross-lingual-transfer-4bad0a199fc7">Aim and MLflow — Choosing Experiment Tracker for Zero-Shot Cross-Lingual Transfer</a> was originally published in <a href="https://medium.com/aimstack">AimStack</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[How to integrate aimlflow with your remote MLflow]]></title>
            <link>https://medium.com/aimstack/how-to-integrate-aimlflow-with-your-remote-mlflow-3e9ace826eaf?source=rss----d26c2a464e51---4</link>
            <guid isPermaLink="false">https://medium.com/p/3e9ace826eaf</guid>
            <category><![CDATA[experiment-tracking]]></category>
            <category><![CDATA[integration]]></category>
            <category><![CDATA[data-visualization]]></category>
            <category><![CDATA[ui]]></category>
            <category><![CDATA[mlflow]]></category>
            <dc:creator><![CDATA[Hovhannes Tamoyan]]></dc:creator>
            <pubDate>Mon, 30 Jan 2023 18:22:26 GMT</pubDate>
            <atom:updated>2023-01-30T18:23:22.232Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*rSCbDO5InRlBXTxzD3NVwQ.png" /></figure><p>We are thrilled to unveil aimlflow, a tool that allows for a smooth integration of a robust experiment tracking UI with MLflow logs! 🚀</p><p>With aimlflow, MLflow users can now seamlessly view and explore their MLflow experiments using Aim’s powerful features, leading to deeper understanding and more effective decision-making.</p><p>We have created a dedicated post on the setup of aimlflow on local environment. For further information and guidance, please refer to the following link:</p><p>Running Aim on the local environment is pretty similar to running it on the remote. See the guide on running multiple trainings using Airflow and exploring results through the UI here: <a href="https://medium.com/aimstack/exploring-mlflow-experiments-with-a-powerful-ui-238fa2acf89e">https://medium.com/aimstack/exploring-mlflow-experiments-with-a-powerful-ui-238fa2acf89e</a></p><p>In this tutorial, we will showcase the steps required to successfully use aimlflow to track experiments on a remote server.</p><h3>Project overview</h3><p>We will use PyTorch and Ray Tune to train a simple convolutional neural network (CNN) on the Cifar10 dataset. 
We will be experimenting with different sizes for the last layers of the network and varying the learning rate to observe the impact on network performance.</p><p>We will use PyTorch to construct and train the network, leverage Ray Tune to fine-tune the hyperparameters, and utilize MLflow to meticulously log the training metrics throughout the process.</p><p>Find the full project code on GitHub: <a href="https://github.com/aimhubio/aimlflow/tree/main/examples/hparam-tuning">https://github.com/aimhubio/aimlflow/tree/main/examples/hparam-tuning</a></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*1XY23jEW68KnwcC3tBKvOw.png" /></figure><h3>Server-side/Remote Configuration</h3><p>Let’s create a separate directory for the demo and name it mlflow-demo-remote. Then download and run the tune.py Python script from the GitHub repo to conduct the training sessions:</p><pre>$ python tune.py</pre><p>Ray Tune will start multiple training trials with different combinations of the hyperparameters and yield output similar to the following on the terminal:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*e0lMlACoiaj1U8A3QB1C7Q.png" /></figure><p>Once started, MLflow will begin recording the results in the mlruns directory. Our remote directory will have the following structure:</p><pre>mlflow-demo-remote<br>├── tune.py<br>└── mlruns<br>    ├── ...</pre><p>Let’s open up the MLflow UI to explore the runs. To launch the MLflow user interface, we simply need to execute the following command from the mlflow-demo-remote directory:</p><pre>$ mlflow ui --host 0.0.0.0</pre><p>By default, the --host is set to 127.0.0.1, limiting access to the service to the local machine only. 
To expose it to external machines, set the host to 0.0.0.0.</p><p>By default, the system listens on port 5000.</p><p>One can set the --backend-store-uri param to specify the URI from which the data will be read, whether it is an SQLAlchemy-compatible database connection or a local filesystem URI; by default it is the path of the mlruns directory.</p><p>Upon navigating to http://127.0.0.1:5000, you will be presented with a page that looks similar to this:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*J8NMyKkAIhdMdFwyIghuVA.png" /></figure><h3>Synchronising MLflow Runs with Aim</h3><p>After successfully initiating our training on the remote server and hosting the user interface, we can begin converting MLflow runs from the remote to our local Aim repository.</p><p>First, let’s move forward with the installation of aimlflow. It’s incredibly easy to set up on your device; just execute the following command:</p><pre>$ pip install aim-mlflow</pre><p>After successfully installing aimlflow on your machine, let’s create a directory named mlflow-demo-local, where the .aim repository will be initialized, and navigate to it. Then, initialize an empty Aim repository by executing the following simple command:</p><pre>$ aim init</pre><p>This will establish an Aim repository in the present directory, and it will be named .aim.</p><p>This is what our local directory will look like:</p><pre>mlflow-demo-local<br>└── .aim<br>    ├── ...</pre><p>In order to navigate and explore MLflow runs using Aim, the aimlflow synchroniser must be run. 
This will convert and store all metrics, tags, configurations, artifacts, and experiment descriptions from the remote into the .aim repository.</p><p>To begin converting MLflow experiments from the hosted URL YOUR_REMOTE_IP:5000 into the Aim repository .aim, execute the following command from our local mlflow-demo-local directory:</p><pre>$ aimlflow sync --mlflow-tracking-uri=&#39;http://YOUR_REMOTE_IP:5000&#39; --aim-repo=.aim</pre><p>The converter will go through all experiments within the project and create a unique Aim run for each experiment with the corresponding hyperparameters, tracked metrics, and logged artifacts. This command will periodically check for updates from the remote server every 10 seconds and keep the data synchronized between the remote and local databases.</p><p>This means that you can run your training script on your remote server without any changes, and at the same time, you can view the real-time logs on the visually appealing UI of Aim on your local machine. How great is that? ☺️</p><p>Now that we have initialized the Aim repository and have logged some parameters, we simply need to run the following command:</p><pre>$ aim up</pre><p>to open the user interface and explore our metrics and other information.</p><p>For further reading, please refer to the <a href="https://aimstack.readthedocs.io/en/latest/">Aim documentation</a>, where you will learn more about the superpowers of Aim.</p><h3>Conclusion</h3><p>To sum up, Aim brings a revolutionary level of open-source experiment tracking to the table, and aimlflow makes it easily accessible for MLflow users with minimal effort. The added capabilities of Aim allow for a deeper exploration of remote runs, making it a valuable addition to any MLflow setup.</p><p>In this guide, we demonstrated how remote MLflow runs can be explored and analyzed using Aim. 
While both tools share some basic functionality, the Aim UI provides a more in-depth exploration of the runs.</p><p>The added value of Aim makes installing aimlflow and enabling the additional capability well worth it.</p><h3>Learn more</h3><p>If you have any questions, join the <a href="https://community.aimstack.io/">Aim community</a>: share your feedback and open issues for new features and bugs. 🙌</p><p>Show some love by dropping a ⭐️ on <a href="https://github.com/aimhubio/aim">GitHub</a> if you think Aim is useful.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=3e9ace826eaf" width="1" height="1" alt=""><hr><p><a href="https://medium.com/aimstack/how-to-integrate-aimlflow-with-your-remote-mlflow-3e9ace826eaf">How to integrate aimlflow with your remote MLflow</a> was originally published in <a href="https://medium.com/aimstack">AimStack</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Aim 2022; Product Recap]]></title>
            <link>https://medium.com/aimstack/aim-2022-product-recap-8025bd878e25?source=rss----d26c2a464e51---4</link>
            <guid isPermaLink="false">https://medium.com/p/8025bd878e25</guid>
            <category><![CDATA[tensorflow]]></category>
            <category><![CDATA[mlops]]></category>
            <category><![CDATA[pytorch]]></category>
            <dc:creator><![CDATA[Gor Arakelyan]]></dc:creator>
            <pubDate>Fri, 20 Jan 2023 16:46:04 GMT</pubDate>
            <atom:updated>2023-01-20T17:47:28.079Z</atom:updated>
<content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*mpgFSGReVaycf5644hM6xw.png" /></figure><h3>A retrospective look at the past year</h3><p>Aim underwent significant improvements in the past year. Over the course of 50 releases, including 12 minor releases and 38 patch releases, Aim received contributions from 35 total contributors, including 18 new external contributors. These contributions included 769 merged pull requests and 487 submitted issues!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/987/1*xhiLjCXeyDNrMIuZU5_SXg.png" /><figcaption>Aim contributions pulse</figcaption></figure><p>There was significant progress in various areas of Aim, including adding more supported platforms, taking the UI to a whole new level, adding remote tracking capabilities, revamping the CLI, extending the QL, and moving towards active monitoring for optimizing large-scale trainings.</p><p>In addition to adding new features and extending Aim, improvements in performance and usability were applied to enhance the overall user experience. Furthermore, new integrations with machine learning tools were implemented to make it easier to use Aim in a broader technology stack.</p><p>Overall, these improvements greatly enhanced the functionality and usability of Aim for tracking machine learning experiments in both small and large scale projects.</p><h3>Aim Community</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/826/1*GitHotOjlY9WPsvQmWaqwQ.png" /><figcaption>Aim contributors</figcaption></figure><blockquote>We are on a mission to democratize AI dev tools!</blockquote><p>Numerous external contributions were pushed to “main” in 2022. 
Without community support, Aim would not be where it is today.</p><p>Congratulations to <a href="https://github.com/Sharathmk99">Sharathmk99</a>, <a href="https://github.com/YodaEmbedding">YodaEmbedding</a>, <a href="https://github.com/hjoonjang">hjoonjang</a>, <a href="https://github.com/jangop">jangop</a>, <a href="https://github.com/djwessel">djwessel</a>, <a href="https://github.com/GeeeekExplorer">GeeeekExplorer</a>, <a href="https://github.com/dsblank">dsblank</a>, <a href="https://github.com/timokau">timokau</a>, <a href="https://github.com/kumarshreshtha">kumarshreshtha</a>, <a href="https://github.com/kage08">kage08</a>, <a href="https://github.com/karan2801">karan2801</a>, <a href="https://github.com/hendriks73">hendriks73</a>, <a href="https://github.com/yeghiakoronian">yeghiakoronian</a>, <a href="https://github.com/uduse">uduse</a>, <a href="https://github.com/arnauddhaene">arnauddhaene</a>, <a href="https://github.com/lukoucky">lukoucky</a>, <a href="https://github.com/Arvind644">Arvind644</a>, <a href="https://github.com/shrinandj">shrinandj</a> for their first contributions over the past year! 🎉</p><p>These contributions have covered a range of areas, including documentation, user guides, the SDK and the UI!</p><h3>Highlights</h3><ul><li>Supported platforms were expanded to include Docker, Apple M1, Conda, Google Colab and Jupyter Notebooks.</li><li>The UI was given a makeover, with upgraded home and run single pages, as well as brand new experiment page. 
The new audio and figure explorers now allow for the exploration of more types of metadata.</li><li>The Aim remote tracking server was rolled out to enable tracking from multi-host environments.</li><li>Active monitoring capabilities were enabled, such as notifications for stalled runs and the ability to programmatically define training heuristics using callbacks and notifications.</li><li>Aim was integrated with 9 different machine learning frameworks, alongside 3 converters that help migrate from other experiment trackers.</li><li>Last but not least, the documentation was completely revamped, including almost 40 pages of guides to help users effectively use and understand Aim.</li></ul><h3>Supported Platforms</h3><p>Improvements were made to ensure Aim works seamlessly with widely used environments and platforms in the ML/MLOps field, including adding support for new platforms.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*aYG-SSawh8Q8jte6BjyNMQ.png" /></figure><ul><li>Support for running Aim on Apple M1 devices.</li><li>Support for using Aim inside Google Colab and Jupyter notebooks.</li><li>Docker images for the Aim UI and server.</li><li>Support for running Aim in Kubernetes clusters.</li><li>Aim became available in Conda.</li></ul><h3>User Interface</h3><p>The UI is one of the key interfaces for interacting with the ML training runs tracked by Aim. It’s super-powerful for comparing large numbers of trainings. 
The UI was significantly improved in the past year with new pages and explorers that help explore more metadata types.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*odqgX33xLVzYBI6sfRYRWQ.png" /><figcaption>Home page</figcaption></figure><ul><li>Revamped home page to see the project’s contributions and overall statistics.</li><li>Revamped run page to deep-dive into an individual run.</li><li>Brand new experiment page to view an experiment’s details, attached runs, etc.</li><li>New explorers, such as the “Figures Explorer” and “Audio Explorer”, to allow cross-run comparison of Plotly figures and audio objects respectively.</li><li>Key “Metrics Explorer” enhancements, including displaying chart legends and the ability to export charts as images.</li></ul><h3>Command Line Interface</h3><p>The Aim CLI offers a simple interface to easily manage tracked trainings. Two key groups of commands were added:</p><ul><li><strong>Runs management</strong> to enable the base operations for managing trainings: list, copy, remove, move, upload, close.</li><li><strong>Storage management</strong> to help manage Aim data storage, such as reindexing, pruning, and managing data format versions (advanced usage).</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1016/1*MZFkXFzmz5BxGYofUgWdTg.png" /></figure><h3>Query Language</h3><p>Aim enables a powerful query language to filter through all the stored metadata using Python expressions. 
A few additions were made, providing a friendlier, more pythonic interface for working with date-time expressions, as well as for querying runs based on their metric results.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/642/1*6J9gouDEt9Mkhv9Cds4dNw.png" /></figure><h3>Remote Tracking</h3><p>The Aim remote tracking server, which allows running trainings in a multi-host environment and collecting tracked data in a centralized location, used to be one of the most requested features.</p><p>It has been gradually rolled out from its experimental phase to a stable version that can scale up and handle an increasing number of parallel trainings.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*CizfTBF8Nh4DZcun6TP8Pg.png" /></figure><p>Switching from a local environment to a centralized server is as easy as pie, as zero code changes are required.⚡ <br>This is because the interfaces are completely compatible!</p><h3>Active Monitoring</h3><p>Aim made its first steps towards optimizing long and large-scale trainings, such as training or tuning GPT-like models. It reduces human-hours spent monitoring/restarting runs, improves reproducibility, and frees up time for the people who need to train these models:</p><ul><li>Ability to notify on stalled/stuck runs.</li><li>Callbacks and notifications to define training heuristics programmatically.</li></ul><p>A quote from “Scaling Laws for Generative Mixed-Modal Language Models” (Aghajanyan et al., 2023, <a href="https://arxiv.org/abs/2301.03728">arXiv:2301.03728</a>):</p><blockquote>We tracked all experiments using the Aim experiment tracker (Arakelyan et al., 2020). To ensure consistent training strategies across our experiments, we implemented a model restart policy using the Aim experiment tracker and callbacks. Specifically, if training perplexities do not decrease after 500 million tokens, the training run is restarted with a reduced learning rate with a factor of 0.8 of the current time step. 
This policy helps remove variance in the scaling laws due to differences in training procedures and allows us to scale up the number of asynchronous experiments significantly. All experiments were conducted in a two-month time frame with a cluster of 768 80GB A100 GPUs. The majority of experiments used 64 GPUs at a time.</blockquote><h3>Key Performance and Usability Optimizations</h3><ul><li>Storage: long metrics sampling and retrieval (&gt;1M steps).</li><li>Storage: robust locking mechanism and automatic background indexing.</li><li>UI: The v3.12 milestone was released, addressing over 30 usability issues.</li><li>UI: virtualizations in various places across the UI to improve responsiveness when displaying large amount of data.</li><li>UI: optimizations in stream decoding and data encoding to enhance overall performance.</li><li>UI: live update optimizations to effectively update and display real-time data.</li></ul><h3>Integrations</h3><p>Aim easily integrates with a large number of widely adopted machine learning frameworks and tools, reducing barriers to get started with Aim. During the past year, a number of integrations were added:</p><ul><li><strong>ML Frameworks</strong> — Pytorch Ignite, CatBoost, LightGBM, KerasTuner, fastai, MXNet, Optuna, PaddlePaddle</li><li><strong>Convertors </strong>to easily migrate from other tools <strong>— </strong>TensorBoard to Aim, MLFlow to Aim, WandB to Aim.</li><li><strong>Tools that integrated Aim</strong> — Meta’s fairseq and metaseq frameworks, HuggingFace’s accelerate, Ludwig and Kedro.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/679/1*YT4Yf1sJUqQLWRtiF8fJgQ.png" /></figure><h3>Learn More</h3><p>If you have any questions join <a href="https://community.aimstack.io/">Aim community</a>, share your feedback, open issues for new features and bugs. 
🙌</p><p>Show some love by dropping a ⭐️ on <a href="https://github.com/aimhubio/aim">GitHub</a>, if you find Aim useful.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=8025bd878e25" width="1" height="1" alt=""><hr><p><a href="https://medium.com/aimstack/aim-2022-product-recap-8025bd878e25">Aim 2022; Product Recap</a> was originally published in <a href="https://medium.com/aimstack">AimStack</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Exploring MLflow experiments with a powerful UI]]></title>
            <link>https://medium.com/aimstack/exploring-mlflow-experiments-with-a-powerful-ui-238fa2acf89e?source=rss----d26c2a464e51---4</link>
            <guid isPermaLink="false">https://medium.com/p/238fa2acf89e</guid>
            <category><![CDATA[mlops]]></category>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[tensorflow]]></category>
            <category><![CDATA[pytorch]]></category>
            <category><![CDATA[mlflow]]></category>
            <dc:creator><![CDATA[Gor Arakelyan]]></dc:creator>
            <pubDate>Wed, 18 Jan 2023 18:55:15 GMT</pubDate>
            <atom:updated>2023-01-18T18:55:15.630Z</atom:updated>
<content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*YMlI66d-QPxI3fcQCoPSlw.png" /></figure><p>We are excited to announce the release of <a href="https://github.com/aimhubio/aimlflow">aimlflow</a>, an integration that helps to seamlessly run a powerful experiment tracking UI on MLflow logs! 🎉</p><p><a href="https://github.com/mlflow/mlflow">MLflow</a> is an open source platform for managing the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry. While MLflow provides a great foundation for managing machine learning projects, it can be challenging to effectively explore and understand the results of tracked experiments. <a href="https://github.com/aimhubio/aim">Aim</a> is a tool that addresses this challenge by providing a variety of features for deeply exploring tracked experiments, surfacing insights, and understanding results via the UI.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*WNn0RhcPt_UCDhwRUnNj4A.gif" /></figure><p><strong>With aimlflow, MLflow users can now seamlessly view and explore their MLflow experiments using Aim’s powerful features, leading to deeper understanding and more effective decision-making.</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*xXGWEV5bJFEOwpjtDZOoHw.png" /></figure><p>In this article, we will guide you through the process of running several CNN trainings, setting up aimlflow, and exploring the results via the UI. Let’s dive in and see how to make it happen.</p><blockquote>Aim is an easy-to-use open-source experiment tracking tool supercharged with abilities to compare 1000s of runs in a few clicks. 
Aim enables a beautiful UI to compare and explore them.</blockquote><blockquote>View more on GitHub: <a href="https://github.com/aimhubio/aim">https://github.com/aimhubio/aim</a></blockquote><h3>Project overview</h3><p>We use a simple project that trains a CNN using <a href="https://github.com/pytorch/pytorch">PyTorch</a> and <a href="https://github.com/ray-project/ray/tree/master/python/ray/tune">Ray Tune</a> on the <a href="https://www.cs.toronto.edu/~kriz/cifar.html">CIFAR-10</a> dataset. We will train multiple CNNs by adjusting the learning rate and the number of neurons in two of the network layers using the following stack:</p><ul><li><strong>PyTorch </strong>for building and training the model</li><li><strong>Ray Tune</strong> for hyper-parameter tuning</li><li><strong>MLflow </strong>for experiment tracking</li></ul><p>Find the full project code on GitHub: <a href="https://github.com/aimhubio/aimlflow/tree/main/examples/hparam-tuning">https://github.com/aimhubio/aimlflow/tree/main/examples/hparam-tuning</a></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*47TN9ur7psCbJZHslWNvFA.png" /></figure><p>Run the trainings by downloading and executing the tune.py Python file:</p><pre>python tune.py</pre><p>You should see similar output, meaning the trainings were successfully initiated:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*p8K-B7Cu-IT9D0dRUBnqfA.png" /></figure><h3>Getting started with aimlflow</h3><p>After the hyper-parameter tuning has run, let’s see how aimlflow can help us explore the tracked experiments via the UI.</p><p>To be able to explore MLflow logs with Aim, we will need to convert MLflow experiments to Aim format. 
All the metrics, tags, config, artifacts, and experiment descriptions will be stored and live-synced in a .aim repo located on the file system.</p><p><strong>This means that you can run your training script and, without modifying a single line of code, view the live logs on the beautiful UI of Aim. Isn’t it amazing? 🤩</strong></p><h4>1. Install aimlflow on your machine</h4><p>It is super easy to install aimlflow; simply run the following command:</p><pre>pip3 install aim-mlflow</pre><h4>2. Sync MLflow logs with Aim</h4><p>Pick any directory on your file system and initialize a .aim repo:</p><pre>aim init</pre><p>Run the aimlflow sync command to sync MLflow experiments with the Aim repo:</p><pre>aimlflow sync --mlflow-tracking-uri=MLFLOW_URI --aim-repo=AIM_REPO_PATH</pre><h4>3. Run Aim</h4><p>Now that we have synced the MLflow logs and have some trainings logged, all we need to do is run Aim:</p><pre>aim up --repo=AIM_REPO_PATH</pre><p>You will see the following message on the terminal output:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*H8dvLDWAF-EE-MgO2MjSWg.png" /></figure><p>Congratulations! Now you can explore the training logs with Aim. 
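</p><p><em>Conceptually, the sync step maps each MLflow run record (its params plus metric histories) onto an Aim-style run. The following minimal Python sketch illustrates the idea only; the dictionary layout and field names are hypothetical, not aimlflow’s actual data model:</em></p><pre>
```python
# Toy illustration of an MLflow-to-Aim run conversion. The dictionary
# layout here is hypothetical and only mirrors the idea of the sync step.
def convert_run(mlflow_run):
    return {
        # Hyper-parameters, with the originating MLflow run id kept alongside
        "hparams": dict(mlflow_run["params"], mlflow_run_id=mlflow_run["run_id"]),
        # Metric histories, ordered by step
        "metrics": {name: [value for _, value in sorted(history)]
                    for name, history in mlflow_run["metrics"].items()},
    }

mlflow_run = {
    "run_id": "abc123",
    "params": {"lr": "0.01"},
    "metrics": {"loss": [(0, 2.3), (1, 1.8)]},
}
aim_run = convert_run(mlflow_run)
print(aim_run["metrics"]["loss"])  # [2.3, 1.8]
```
</pre><p>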
🎉</p><h3>Quick tour of Aim for MLflow users</h3><p>In this section, we will take a quick tour of Aim’s features, including:</p><ul><li>Exploring hyper-parameter tuning results</li><li>Comparing tracked metrics</li><li>Taking a look at the other capabilities Aim provides</li></ul><h4>Exploring MLflow experiments</h4><p>Now that Aim is set up and running, we navigate to the project overview page at 127.0.0.1:43800, where the summary of the project is displayed:</p><ul><li>The number of tracked training runs and experiments</li><li>Statistics on the amount of tracked metadata</li><li>A list of experiments and tags, with the ability to quickly explore selected items</li><li>A calendar and feed of contributions</li><li>A table of in-progress trainings</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*TjHqr4lK-aFPJPPh5rGAqQ.png" /><figcaption>Project overview</figcaption></figure><p>To view the results of the trainings, let’s navigate to the runs dashboard at 127.0.0.1:43800/runs. Here, you can see the hyper-parameters and metric results of all the trainings.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*hryGi06eJll6wy2L3zWzyg.png" /><figcaption>Runs dashboard</figcaption></figure><p>We can deeply explore the results and tracked metadata for a specific run by clicking on its name on the dashboard.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*0IfU15byWb7Fx2kPf-d5Jg.png" /><figcaption>Run page</figcaption></figure><p>On this page, we can view the tracked hparams, including the mlflow_run_id and the mlflow_run_name, which are extracted from MLflow runs during the conversion process. 
Additionally, detailed information about the run can be found on each of the tabs, such as tracked hparams, metrics, notes, output logs, system resource usage, etc.</p><h4>Comparing metrics</h4><p>Comparing metrics across several runs is super easy with Aim:</p><ul><li>Open the metrics page from the left sidebar</li><li>Select the desired metrics by clicking on the + Metrics button</li><li>Press the Search button in the top right corner</li></ul><p>We will select losses and accuracies to compare them over all the trials. The following view of metrics will appear:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*aMzdqTNogK0dONh5nzb_LQ.png" /></figure><p>Aim comes with powerful grouping capabilities. Grouping divides metrics into subgroups based on some criteria and applies the corresponding style. Aim supports 3 ways of grouping metrics:</p><ul><li><strong>by color</strong> — each group of metrics will be filled in with its unique color</li><li><strong>by stroke style</strong> — each group will have a unique stroke style (solid, dashed, etc)</li><li><strong>by facet</strong> — each group of metrics will be displayed in a separate subplot</li></ul><p>To learn which set of trials performed the best, let’s apply several groupings:</p><ul><li>Group by the run.hparams.l1 hyper-parameter to color the picked metrics based on the number of outputs of the first fully connected layer</li><li>Group by metric.name to divide losses and accuracies into separate subplots (this grouping is applied by default)</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*TAOwGrp9b7X3eJgzsLE3tg.png" /></figure><p><strong>Aim also provides a way to select runs programmatically. It enables applying a custom query instead of just picking metrics of all the runs. 
Aim queries support Python syntax, allowing us to access the properties of objects via dot notation, make comparisons, and perform more advanced Python operations.</strong></p><p>For example, to display only runs that were trained for more than one iteration, we will run the following query:</p><pre>run.metrics[&#39;training_iteration&#39;].last &gt; 1</pre><blockquote>Read more about Aim query language capabilities in the docs: <a href="https://aimstack.readthedocs.io/en/latest/using/search.html">https://aimstack.readthedocs.io/en/latest/using/search.html</a></blockquote><p>This will result in querying and displaying 9 matched runs:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Rcby1yyJPMzTdMgcNvaqlw.png" /></figure><p>Let’s aggregate the groups to see which one performed the best:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*gh_Ep2PVX2WsQQbPL2NyhA.png" /></figure><p>From the visualizations, it is clear that the purple group outperformed the rest. This means that the trials with 64 outputs in the first fully connected layer achieved the best performance.
🎉</p><blockquote>For more, please see the official Aim docs: <a href="https://aimstack.readthedocs.io/en/latest/">https://aimstack.readthedocs.io/en/latest/</a></blockquote><h4>Last but not least: a closer look at Aim’s key features</h4><ul><li>Use powerful Pythonic search to select the runs you want to analyze:</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*UnfUbRh5yLa5PceBi2a0HQ.png" /></figure><ul><li>Group metrics by hyperparameters to analyze hyperparameters’ influence on run performance:</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*1M8Z_CS6j9_nBQXifp-JBw.png" /></figure><ul><li>Select multiple metrics and analyze them side by side:</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*j6K9X_-LL4aHb9ZhCDxahQ.png" /></figure><ul><li>Aggregate metrics by std.dev, std.err, conf.interval:</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*8hXsPXkgqiDm-GqM1PqJ_w.png" /></figure><ul><li>Align the x-axis by any other metric:</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ulpsYaFXGiTPSh5oM9XYwQ.png" /></figure><ul><li>Use scatter plots to explore correlations and trends:</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*4uxIpoDd9r7DWa96lR-imw.png" /></figure><ul><li>Visualize high-dimensional data via parallel coordinate plots:</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Z37L2fKGl0mA-x7tOdq9IA.png" /></figure><ul><li>Explore media metadata, such as images and audio objects, via Aim Explorers:</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*2E8ybu8TxejHzjgRT6sReg.png" /><figcaption>Audio Explorer</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*GdFVthJ1ee2AeinV.png" /><figcaption>Images Explorer</figcaption></figure><h3>Conclusion</h3><p>In conclusion, Aim enables a completely new level of open-source
experiment tracking, and aimlflow makes it available to MLflow users with just a few commands and zero code changes!</p><p>In this guide, we demonstrated how MLflow experiments can be explored with Aim. When it comes to exploring experiments, Aim augments MLflow’s capabilities by enabling rich visualizations and manipulations.</p><p>We covered only the basics of Aim; it has much more to offer, including a super fast query system and a rich visualization interface. For more, please <a href="https://aimstack.readthedocs.io/en/latest/overview.html">read the docs</a>.</p><h3>Learn more</h3><p>If you have any questions, join the <a href="https://community.aimstack.io">Aim community</a>, share your feedback, and open issues for new features and bugs. 🙌</p><p>Show some love by dropping a ⭐️ on <a href="https://github.com/aimhubio/aim">GitHub</a> if you think Aim is useful.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=238fa2acf89e" width="1" height="1" alt=""><hr><p><a href="https://medium.com/aimstack/exploring-mlflow-experiments-with-a-powerful-ui-238fa2acf89e">Exploring MLflow experiments with a powerful UI</a> was originally published in <a href="https://medium.com/aimstack">AimStack</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Aim 2022 community report]]></title>
            <link>https://medium.com/aimstack/aim-2022-community-report-a8efd0fb4472?source=rss----d26c2a464e51---4</link>
            <guid isPermaLink="false">https://medium.com/p/a8efd0fb4472</guid>
            <category><![CDATA[mlops]]></category>
            <category><![CDATA[tensorflow]]></category>
            <category><![CDATA[pytorch]]></category>
            <dc:creator><![CDATA[Gev Sogomonian]]></dc:creator>
            <pubDate>Fri, 06 Jan 2023 18:07:40 GMT</pubDate>
            <atom:updated>2023-01-06T18:07:39.921Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*OvP3KCeUaBbqWj_jq4c5xg.png" /></figure><p>Hey lovely Aimers, Happy New Year! 🎆</p><p>Now that the 2022 is over, we are excited to share what has happened in 2022. We have achieved a lot together!</p><h3>TLDR — Highlights</h3><p>In 2022 we have taken Aim to completely different level. It has become from a cool open-source project into a full-blown product. An inspiring community-led effort!!</p><ul><li>We have shipped <a href="https://github.com/aimhubio/aim/releaseshttps://github.com/aimhubio/aim/releases">12 new Aim versions</a> and countless other patches</li><li>Almost every week for the past 80+ weeks a new Aim version has been shipped — <strong>a blistering pace of innovation!! 😅</strong></li><li>Aim contributors have doubled to 48 in 2022</li><li>~1300 GitHub issues have been created in 2022 and over 1100 of them have been resolved.</li><li>We have reached 3K stars on GitHub. ⭐️ (If you like our work, <a href="https://github.com/aimhubio/aim">drop us a star!</a> )</li><li>We <a href="https://medium.com/aimstack/aim-community-is-moving-to-discord-da24a52169db">launched</a> our community on <a href="https://community.aimstack.io/">discord</a>.</li></ul><h3>Product</h3><p>We have shipped 100s of major features in 2022. It would take several chapters to talk about all of them.</p><p>However, here are two fundamental features that we haven’t talked much about yet.</p><h4>Base Explorer</h4><ul><li>A foundational work to help compare any type of renderable ML metadata at mass.</li><li>This is the first experimental step we have made to turn Aim into a one-stop-shop for multidimensional metadata analysis.</li></ul><p>The Figures explorer runs on Base Explorer</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*xRbl_iVzy8FwRfj6eZWffA.png" /></figure><h4>Aim Callbacks</h4><p>A surgical intervention into your ML runs. 
For everything!!</p><ul><li>Aim users use the callbacks to create policies to automatically manage trainings. A major step towards reproducibility and true training management. Hours of GPU time saved in the worst-case scenario!!</li></ul><p>We are going to talk a lot about the callbacks too! 🙌</p><p>More about callbacks <a href="https://aimstack.readthedocs.io/en/latest/using/callbacks.html#">here</a>.</p><h4>Aim Roadmap</h4><p>Here is how the public Aim roadmap looks now. This is just a subset of the additions made. A monumental work!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/827/1*5rYmolGfB_bEjGHFb4TB5Q.png" /></figure><h3>Integrations</h3><p>Aim is heavily integrated with the AI ecosystem now.</p><ul><li>In 2022, Aim ecosystem integrations grew 3x</li><li>~10 more integrations are also in progress right now ❤️</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/713/1*3EMYy1Eh6qYqrLZo2i6UPw.png" /></figure><ul><li>Lots of impactful repos have adopted Aim as well, including <a href="https://github.com/huggingface/accelerate">HF Accelerate</a>, <a href="https://github.com/facebookresearch/metaseq">FAIR Metaseq</a> (default tracker), <a href="https://github.com/facebookresearch/fairseq">FAIR Fairseq</a>, <a href="https://github.com/ludwig-ai/ludwig">Ludwig</a>, etc.</li><li>Community-requested Aim converters for MLflow, TensorBoard and <a href="https://medium.com/aimstack/aim-tutorial-for-weights-and-biases-users-bfbdde76f21e?source=collection_home---4------0-----------------------">Wandb</a>.</li></ul><h3>Onwards to 2023</h3><p>In 2023 we are going to work hard to put Aim into the hands of as many researchers and MLOps engineers as possible!</p><p>Soon we will publish the 2023 Aim roadmap! Lots to share. :)</p><p><strong>2023 here we come!
😊</strong></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=a8efd0fb4472" width="1" height="1" alt=""><hr><p><a href="https://medium.com/aimstack/aim-2022-community-report-a8efd0fb4472">Aim 2022 community report</a> was originally published in <a href="https://medium.com/aimstack">AimStack</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
    </channel>
</rss>