Importing data to Calkulate¶
All the code snippets below assume the import convention:
import calkulate as calk
Titration metadata Dataset¶
Calkulate works by adding extra functions that can be used on a standard pandas DataFrame. It also provides an optional Dataset class which adds these functions as additional methods to a DataFrame. A Dataset contains the metadata for any number of separate titrations.
Convert from a DataFrame¶
If your titration metadata is already imported as a DataFrame, you can directly convert it into a Dataset:
ds = calk.Dataset(df)
Import from Excel, CSV or dbs¶
If your titration metadata is instead in an Excel spreadsheet, a CSV file, or another format that pandas can read, import it with the appropriate function below:
ds = calk.read_csv("path/to/metadata_file.csv", **read_csv_kwargs)
ds = calk.read_excel("path/to/metadata_file.xlsx", **read_excel_kwargs)
ds = calk.read_fwf("path/to/metadata_file.txt", **read_fwf_kwargs)
ds = calk.read_table("path/to/metadata_file.txt", **read_table_kwargs)
These are all light wrappers around the pandas functions of the same names. You can add any kwargs needed by the pandas functions and they will be passed on too.
If you have a VINDTA-generated .dbs file, you can import this too:
ds = calk.read_dbs("path/to/metadata_file.dbs")
read_dbs also renames several columns from their defaults to the names expected by Calkulate, parses analysis dates and times, and predicts the file names based on the data in the file.
The columns in the Dataset must be named in a specific way for Calkulate to be able to use their data.
Individual titration data files¶
You don't need to manually import titration data files, but you should check that their format is consistent with what Calkulate expects, and work out what extra arguments you'll need to pass if not.
The information they contain¶
Each titration file is a text file containing the measurements of the solution carried out during a titration. The file should contain data in columns, where each row represents a measurement after a separate titrant addition. There must be at least three columns:
- The amount of titrant added to the analyte in mL or g.
- The EMF measured across the titrant-analyte mixture in mV.
- The temperature of the titrant-analyte mixture in °C.
Their default format¶
By default, Calkulate expects that:
- There are two lines at the start of the file to be ignored before the data columns described above begin.
- The columns appear next, in the order given (from left to right).
- The columns are tab-delimited.
- Nothing comes after the columns of titration data in the file.
For example, a file could contain the following:
This first line is ignored.
This second line is also ignored.
0.00 183.1 25.2
0.50 225.4 25.1
1.00 290.3 25.0
1.50 343.4 25.1
If your files look like the above (e.g. they were generated by a VINDTA), then you don't need to read any further on this page.
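To check a file against the default layout before running Calkulate, the rules above can be sketched with the Python standard library alone. This is only an illustration of the format, not how Calkulate itself reads files; the function name read_default_titration and its parameters are hypothetical.

```python
import csv


def read_default_titration(text, skip_lines=2, delimiter="\t"):
    """Parse text laid out in the default format described above.

    Hypothetical helper for checking files by hand; it is not part of
    Calkulate. Returns three lists: titrant amount, EMF and temperature.
    """
    # Drop the header lines that come before the data columns.
    lines = text.splitlines()[skip_lines:]
    titrant, emf, temperature = [], [], []
    for row in csv.reader(lines, delimiter=delimiter):
        if not row:
            continue  # tolerate a blank line at the end of the file
        # Columns are expected in this order, from left to right.
        titrant.append(float(row[0]))
        emf.append(float(row[1]))
        temperature.append(float(row[2]))
    return titrant, emf, temperature


# The example file from above, with explicit tab delimiters.
example = (
    "This first line is ignored.\n"
    "This second line is also ignored.\n"
    "0.00\t183.1\t25.2\n"
    "0.50\t225.4\t25.1\n"
    "1.00\t290.3\t25.0\n"
    "1.50\t343.4\t25.1\n"
)
titrant, emf, temperature = read_default_titration(example)
```

If this raises a ValueError or IndexError on one of your files, the file probably differs from the default format (for example, a different delimiter or number of header lines) and you will need extra arguments or another read_dat_method.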
Other formats¶
Alternatively, you may have files in a different format, for example generated directly by a Metrohm Titrino unit. These .txt files typically have names beginning with PC_LIMS_Report_ and the titration data is found in six columns somewhere in the middle of the file. These files can be imported by Calkulate too: when you run the calibrate, solve or calkulate functions, you just need to include read_dat_method="pclims" as a kwarg (or if using the dataset approach, add a read_dat_method column to your metadata table).
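For the dataset approach, one way to fill in a read_dat_method column is to tag rows by file name before importing the metadata table. The sketch below uses only the standard library and makes several assumptions: the column holding the file names is called file_name here purely for illustration, the helper tag_pclims is hypothetical, and rows for other files are left blank, so you should check how Calkulate treats blank entries in this column for your version.

```python
import csv
import io


def tag_pclims(rows, file_column="file_name", prefix="PC_LIMS_Report_"):
    """Return copies of the metadata rows with a read_dat_method entry.

    Hypothetical helper, not part of Calkulate. Rows whose file name
    starts with the prefix get "pclims"; all others are left blank
    (assumption: verify how blanks are handled before relying on this).
    """
    tagged = []
    for row in rows:
        new_row = dict(row)
        is_pclims = new_row[file_column].startswith(prefix)
        new_row["read_dat_method"] = "pclims" if is_pclims else ""
        tagged.append(new_row)
    return tagged


# A tiny illustrative metadata table with made-up file names.
metadata_csv = "file_name\nPC_LIMS_Report_sample1.txt\nsample2.dat\n"
rows = list(csv.DictReader(io.StringIO(metadata_csv)))
tagged = tag_pclims(rows)

# Write the tagged table back out as CSV text, ready to be imported.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["file_name", "read_dat_method"])
writer.writeheader()
writer.writerows(tagged)
```

If every file in the table comes from a Titrino unit, it is simpler to pass read_dat_method="pclims" as a kwarg instead, as described above.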
If your titration data files arrive in some other format, it's quite straightforward to add a new read_dat_method option that will allow them to be imported directly. If this applies to you, please just create an Issue on the GitHub repo, attaching an example of the file you need to import.