How to create an AI Model
Welcome to Cradl AI!
In this guide, you will train an AI model to automatically extract data from your image or PDF documents into JSON, ready for export to your other apps.
The guide takes 15 minutes to 1 hour, depending on how thoroughly you train your model. For those who prefer a visual learning experience, the video tutorial below covers the same content:
Create a free Cradl AI account
Start by signing up for a free Cradl AI account. After you complete the sign-up, you will be redirected to your Models.
• Click on New Model
Create a new AI Model
Welcome to your model overview. It consists of AI Model, Validation, Trigger, and Export. You only need to configure AI Model to start parsing your documents.
• Click on AI Model
What type of document do you have?
Invoices, purchase orders, bills of lading, ID cards, contracts, and more: there are many types of documents in the world. What type of document do you want to parse?
If we have a template for your document type, start with that template. If not, start from scratch.
Model from template vs. model from scratch
Whether to start from a template or from scratch depends on what you want to extract from your documents.
For example, you may have invoices from which you want to extract fields such as a total amount, a supplier name, an invoice number, and so on. If you browse through the templates, you will find an invoice template that has those fields. In that case, you should start with the template instead of from scratch. You can easily add and remove fields from templates if necessary.
If your documents are unlike any of our templates, then choose to start from scratch.
Which fields do you want to extract from your documents?
For every piece of information you want to extract from your documents, you create a corresponding field in your AI Model, so start by making a list of the data you want to extract. When training your first model, we recommend including only the data you need. You can add nice-to-have data once your initial model is giving good results.
The number of fields is flexible; you can start with a few and add new fields later if necessary.
Name your fields
A field consists of a title and a data type.
A field's title is yours to decide. It does not have to correspond to what you see on your documents. For example, if your documents have a total amount of 20.000 labelled as "total: 20.000", you are free to name your field total, total amount, price, or something in another language.
Avoid changing a field's title once you have started training your model (we will train the model in the next section of this tutorial). Changing a field's title after training requires you to retrain that specific field.
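If it helps to plan your fields before entering them in the app, the title-plus-type structure can be sketched in a few lines of Python. This is only a planning aid, not Cradl AI's API: the `Field` and `FieldType` names and the snake_case titles are assumptions made for illustration, while the six type names are the ones covered in the next part of this guide.

```python
from dataclasses import dataclass
from enum import Enum


class FieldType(Enum):
    # The six data types described in this guide.
    STRING = "String"
    AMOUNT = "Amount"
    DATE = "Date"
    NUMERIC = "Numeric"
    LINE_ITEMS = "Line items"
    CLASSIFICATION = "Classification"


@dataclass(frozen=True)
class Field:
    title: str       # your own name; it need not match the label on the document
    type: FieldType  # one of the data types above


# Hypothetical plan for a first invoice model: only the data you need.
invoice_fields = [
    Field("total_amount", FieldType.AMOUNT),
    Field("supplier_name", FieldType.STRING),
    Field("invoice_number", FieldType.STRING),
]

for planned in invoice_fields:
    print(f"{planned.title}: {planned.type.value}")
```

Writing the list down first makes it easier to settle on titles before training starts, since renaming a field later means retraining it.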
Set your field types
- String
- Amount
- Date
- Numeric
- Line items
- Classification
String is the most commonly used type. It is used for data containing letters or a mix of letters and numbers.
A common use case for strings is addresses, which you can capture in a single String field instead of creating a separate field for each line of the address.
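As a rough illustration of the single-field approach, the snippet below treats a whole address as one string. The address value and the use of newline characters between address lines are assumptions for the example; the exact formatting of an extracted value may differ.

```python
# Hypothetical extracted value: the whole address lives in one String field.
supplier_address = "Acme Supplies Ltd\n12 Harbour Road\nBristol BS1 4QA"

# Splitting into lines is optional and would happen on your side, after export.
address_lines = supplier_address.splitlines()
print(address_lines)
```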
Use the Amount type if your document has monetary amounts.
Use the Date type if your document has dates.
Numeric is used for numbers, such as quantities, numeric IDs, etc. A Numeric field does not accept letters.
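When you are unsure whether a value belongs in a Numeric or a String field, a quick check of a few sample values can help. The helper below is a hypothetical planning aid, not part of Cradl AI: it only applies the rule that Numeric fields do not accept letters, and it does not consider amounts or dates, which have their own types.

```python
def suggest_field_type(samples: list[str]) -> str:
    """Suggest "Numeric" if no sample contains a letter, otherwise "String"."""
    has_letters = any(ch.isalpha() for sample in samples for ch in sample)
    return "String" if has_letters else "Numeric"


print(suggest_field_type(["12", "3", "250"]))        # quantities: Numeric
print(suggest_field_type(["INV-1001", "INV-1002"]))  # IDs with letters: String
```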
Use Line items for tabular data, whether the table has a single row or multiple rows.
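One way to picture a Line items field is as a list with one entry per table row. The structure below is a sketch under that assumption; the column names (`description`, `quantity`, `unit_price`) and the values are made up for illustration, and the actual export format may look different.

```python
# Hypothetical extraction result for a Line items field.
# Each row of the table becomes one dictionary; the column names are your own.
line_items = [
    {"description": "Widget", "quantity": 2, "unit_price": 10.00},
    {"description": "Gadget", "quantity": 1, "unit_price": 25.50},
]

# A single-row table is simply a list with one entry.
total = sum(row["quantity"] * row["unit_price"] for row in line_items)
print(total)
```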
The Classification type is ideal when your model must predict one of several predefined classes. For instance, you might include a Classification field with two classes, INVOICE and CREDIT_NOTE, to tell whether a document is an invoice or a credit note. Classification is also often recommended for reading currencies from documents.
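Because a Classification field can only take one of a fixed set of values, it maps naturally onto an enumeration in downstream code. The sketch below uses the INVOICE and CREDIT_NOTE classes from the example above; the `DocumentType` enum and the `signed_total` helper are hypothetical and only show how an exported class value might be used in your own app.

```python
from enum import Enum


class DocumentType(Enum):
    INVOICE = "INVOICE"
    CREDIT_NOTE = "CREDIT_NOTE"


def signed_total(document_type: DocumentType, total: float) -> float:
    """Treat credit notes as negative amounts; leave invoices unchanged."""
    return -total if document_type is DocumentType.CREDIT_NOTE else total


# Hypothetical predicted class, as it might appear in the exported data.
predicted = DocumentType("CREDIT_NOTE")
print(signed_total(predicted, 120.0))
```

Looking up the value through the enum also means an unexpected class name raises an error immediately instead of passing through unnoticed.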
• Click Create when you have finished configuring your fields.