Troubleshooting and FAQ
This page addresses common issues and frequently asked questions when using the LorenzCycleToolkit.
Track File Issues
Track file has more timesteps than input data
Problem: Your track file extends beyond the temporal coverage of your NetCDF data file.
Example:
Track file: 2020-01-01 00:00 to 2020-01-10 23:00
Input data: 2020-01-01 00:00 to 2020-01-08 18:00
Error message:
❌ Track final timestamp (2020-01-10 23:00:00) is later than data final timestamp
(2020-01-08 18:00:00). Please adjust the track file or re-download the data.
Solutions:
Trim your track file to match the data coverage:
import pandas as pd

# Load track file
track = pd.read_csv('inputs/track', sep=';', parse_dates=['time'], index_col='time')

# Trim to match data period
track_trimmed = track['2020-01-01':'2020-01-08 18:00']

# Save trimmed track
track_trimmed.to_csv('inputs/track_trimmed', sep=';')
Download additional data to cover the full track period (if using CDS API):
python lorenzcycletoolkit.py extended_data.nc -t -r --cdsapi --trackfile inputs/track
Use a pre-downloaded file that covers the full period
Track file has fewer timesteps than input data
Problem: Your track file covers a shorter period than your input data.
Example:
Track file: 2020-01-03 00:00 to 2020-01-05 18:00
Input data: 2020-01-01 00:00 to 2020-01-10 23:00
Behavior: This is perfectly fine! The toolkit will automatically use only the timesteps from the input data that match the track file period.
What happens:
# The toolkit automatically does this:
data = data.sel(time=track.index.values)
Your analysis will run from 2020-01-03 00:00 to 2020-01-05 18:00, using only the relevant portion of the input data.
Use case: This is useful when you have a large dataset but only want to analyze a specific event or time period.
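If you want to confirm the overlap before running the toolkit, a quick check along these lines works (the file name your_file.nc and the time coordinate name are assumptions; adjust them to your dataset):
import pandas as pd
import xarray as xr

# Load the track and the input data (file name is a placeholder)
track = pd.read_csv('inputs/track', sep=';', parse_dates=['time'], index_col='time')
data = xr.open_dataset('your_file.nc')

# Timesteps in the track that are not present in the data
missing = track.index.difference(pd.DatetimeIndex(data.time.values))
print(f'{len(missing)} track timesteps are missing from the data')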
Track file has timesteps not present in data
Problem: Some track timesteps don’t match any data timesteps (e.g., track has 1-hour intervals but data has 3-hour intervals).
Example:
Track timesteps: 00:00, 01:00, 02:00, 03:00, 04:00, ...
Data timesteps: 00:00, 03:00, 06:00, 09:00, ...
Error message:
KeyError: 'Some timesteps from track not found in data'
Solutions:
Resample your track file to match data temporal resolution:
import pandas as pd

# Load track
track = pd.read_csv('inputs/track', sep=';', parse_dates=['time'], index_col='time')

# Resample to 3-hour intervals (matching data)
track_resampled = track.resample('3H').first()  # or .interpolate() for interpolation

# Remove any NaN rows
track_resampled = track_resampled.dropna()

# Save resampled track
track_resampled.to_csv('inputs/track_resampled', sep=';')
If using CDS API, the toolkit automatically handles this by resampling the track:
# The toolkit will resample your track to match --time-resolution
python lorenzcycletoolkit.py data.nc -t -r --cdsapi --time-resolution 3 --trackfile inputs/track
Track temporal resolution differs from data
Problem: Track has higher temporal resolution than the data (e.g., 1-hour track vs 6-hour data).
Behavior: The toolkit checks this and raises an error if track resolution is higher than data resolution.
Error message:
❌ Track time step is higher than data time step. Please resample the track
to match the data time step.
Why this matters: You cannot analyze a system every hour if your atmospheric data is only available every 6 hours.
Solution: Resample your track file so its time step matches, or is coarser than, the data time step:
import pandas as pd
track = pd.read_csv('inputs/track', sep=';', parse_dates=['time'], index_col='time')
# If data is 6-hourly, resample track to 6-hourly
track_6h = track.resample('6H').first() # Takes the first value in each 6h window
# OR use interpolation if you want intermediate positions
track_6h = track.resample('6H').interpolate()
track_6h = track_6h.dropna()
track_6h.to_csv('inputs/track_6h', sep=';')
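If you are unsure what time step your track and data actually have, a short diagnostic sketch like the one below can print both before you decide how to resample (the file name your_file.nc and the time coordinate name are placeholders):
import pandas as pd
import xarray as xr

track = pd.read_csv('inputs/track', sep=';', parse_dates=['time'], index_col='time')
data = xr.open_dataset('your_file.nc')  # placeholder file name

# Smallest spacing between consecutive timestamps in each source
track_step = track.index.to_series().diff().min()
data_step = pd.DatetimeIndex(data.time.values).to_series().diff().min()
print(f'Track time step: {track_step} | Data time step: {data_step}')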
Units and Variable Specifications
Do variables need to match units specified in namelist?
Short answer: No, but the namelist must specify the actual units in your file.
How it works:
The namelist tells the toolkit what units are currently in your NetCDF file
The toolkit reads the data with those units
The toolkit automatically converts everything to SI units internally using MetPy
Example:
If your ERA5 file has temperature in Kelvin:
;standard_name;Variable;Units
Air Temperature;air_temperature;t;K
The toolkit will:
Read temperature as Kelvin
Keep it in Kelvin for calculations (Kelvin is already SI)
If your file had temperature in Celsius (unusual):
;standard_name;Variable;Units
Air Temperature;air_temperature;t;degC
The toolkit would:
Read temperature as Celsius
Convert to Kelvin internally
Perform all calculations in SI units
Important: The “Units” column in the namelist should match what’s actually in your file, not what you want. The toolkit handles conversions automatically.
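To see the kind of conversion the toolkit performs, here is a minimal sketch using MetPy's unit registry (the Celsius value is only an illustration; ERA5 files are normally already in Kelvin):
from metpy.units import units

# A temperature declared as degrees Celsius in a (hypothetical) input file
temp_c = units.Quantity(25.0, 'degC')

# Convert to Kelvin, the SI unit used internally for all calculations
temp_k = temp_c.to('K')
print(temp_k)  # 298.15 kelvin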
Do pressure levels need to be in Pa or hPa?
Answer: Either is fine! The toolkit automatically converts to Pascals (Pa) internally.
Example from the code:
# The toolkit does this automatically:
levels_Pa = (data[LevelIndexer] * units(str(data[LevelIndexer].units))).metpy.convert_units("Pa")
Your file can have:
pressure_level in hPa: [1000, 925, 850, 700, 500, …]
pressure_level in Pa: [100000, 92500, 85000, 70000, 50000, …]
Both will work correctly. Just make sure the units attribute in your NetCDF file is set correctly.
Why does my analysis fail with “units” error?
Problem: Error mentioning units or unit conversion.
Common causes:
Missing units attribute in NetCDF file:
import xarray as xr

# Check if variables have units
ds = xr.open_dataset('your_file.nc')
print(ds['t'].attrs)  # Should show 'units': 'K'
print(ds['pressure_level'].attrs)  # Should show 'units': 'hPa' or 'Pa'
Solution: Add units to your NetCDF file:
ds['t'].attrs['units'] = 'K'
ds['pressure_level'].attrs['units'] = 'hPa'
ds.to_netcdf('your_file_fixed.nc')
Non-standard unit names: Use standard CF conventions:
Temperature: K or degC (not Kelvin or celsius)
Pressure: Pa or hPa (not mb or millibars)
Winds: m/s or m s-1 (not m/sec)
Geopotential vs Geopotential Height:
Geopotential: units should be m**2 s**-2 or m2 s-2
Geopotential Height: units should be m
The toolkit handles both, but you must specify correctly in your namelist.
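For the last two causes, a short inspection script along these lines can confirm what your file actually declares before you edit the namelist (your_file.nc and the variable names t and z are placeholders):
import xarray as xr

ds = xr.open_dataset('your_file.nc')  # placeholder file name

# Normalize a non-standard temperature unit string to its CF form
if ds['t'].attrs.get('units') in ('Kelvin', 'kelvin'):
    ds['t'].attrs['units'] = 'K'

# Check whether 'z' holds geopotential or geopotential height
z_units = ds['z'].attrs.get('units')
if z_units in ('m**2 s**-2', 'm2 s-2', 'm**2/s**2'):
    print('z is geopotential; declare it that way (with these units) in the namelist')
elif z_units == 'm':
    print('z is geopotential height; declare it that way in the namelist')

ds.to_netcdf('your_file_fixed.nc')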
Data Format and Loading Issues
“Could not open file” error
Error message:
❌ Could not open file. Check if path, namelist file, and file format (.nc) are correct.
Checklist:
File path is correct:
ls -lh path/to/your/file.nc # Verify file exists
File is valid NetCDF:
ncdump -h your_file.nc # Should show header without errors
File is not corrupted:
import xarray as xr

ds = xr.open_dataset('your_file.nc')  # Should open without errors
You have read permissions:
chmod 644 your_file.nc # Add read permissions if needed
“Namelist does not match the data” error
Error message:
❌ The variable list does not match the data. Check if the 'namelist' text file is correct.
Followed by a detailed list of available coordinates and variables in your dataset.
Solution process:
Check the error output - it shows all coordinates and variables in your file:
📊 DATASET INFORMATION
======================================================================
🗺️ Available Coordinates:
  • latitude (degrees_north) - latitude
  • longitude (degrees_east) - longitude
  • level (hPa) - pressure_level
  • time - valid_time

📋 Available Variables:
  • t (K) - Temperature [air_temperature]
  • u (m/s) - U component of wind [eastward_wind]
  • v (m/s) - V component of wind [northward_wind]
  • w (Pa/s) - Vertical velocity [lagrangian_tendency_of_air_pressure]
  • z (m**2/s**2) - Geopotential [geopotential]
Update your namelist to use the actual variable names:
;standard_name;Variable;Units
Air Temperature;air_temperature;t;K
Geopotential;geopotential;z;m**2/s**2
Omega Velocity;omega;w;Pa/s
Eastward Wind Component;eastward_wind;u;m/s
Northward Wind Component;northward_wind;v;m/s
Longitude;;longitude
Latitude;;latitude
Time;;time
Vertical Level;;level
Use a preset if available:
# For ERA5 data
cp inputs/namelist_ERA5-cdsapi inputs/namelist

# For NCEP-R2 data
cp inputs/namelist_NCEP-R2 inputs/namelist
Memory and Performance Issues
“MemoryError” or system becomes unresponsive
Problem: Your dataset is too large for available RAM.
Immediate solutions:
Reduce spatial domain: Pre-process your data to a smaller region:
import xarray as xr

ds = xr.open_dataset('large_file.nc')

# Subset to smaller region
ds_subset = ds.sel(
    latitude=slice(-50, -10),
    longitude=slice(-70, -30)
)
ds_subset.to_netcdf('smaller_file.nc')
Reduce temporal coverage: Analyze shorter periods:
ds_subset = ds.sel(time=slice('2020-01-01', '2020-01-07'))
Reduce vertical levels: Keep only tropospheric levels:
ds_subset = ds.sel(level=slice(1000, 100)) # 1000 to 100 hPa
Coarsen the temporal resolution: Use 6-hour instead of 3-hour data:
ds_subset = ds.sel(time=ds.time.dt.hour.isin([0, 6, 12, 18]))
Remove unnecessary variables:
# Keep only required variables
required_vars = ['t', 'u', 'v', 'w', 'z', 'latitude', 'longitude', 'level', 'time']
ds_subset = ds[required_vars]
See also: Usage section on “System Requirements and Performance”
Analysis takes too long
Problem: Processing is slower than expected.
Optimization tips:
Use appropriate temporal resolution: 6-hour data processes faster than 1-hour
Reduce domain size: Analyze only the region of interest
Disable plots for initial runs: Remove the -p flag during testing
Use fixed framework instead of interactive: -f is much faster than -c
Check your data preprocessing: Pre-subset data before running the toolkit (see the subsetting examples in the memory section above)
Configuration Issues
“No such file or directory: ‘inputs/namelist’”
Problem: You didn’t create the namelist file.
Solution: Copy one of the presets:
# For ERA5 data
cp inputs/namelist_ERA5-cdsapi inputs/namelist
# For NCEP Reanalysis 2
cp inputs/namelist_NCEP-R2 inputs/namelist
# For NCEP Reanalysis 1
cp inputs/namelist_NCEP-R1 inputs/namelist
See also: Configuration for detailed instructions.
“No such file or directory: ‘inputs/box_limits’” (fixed framework)
Problem: You are using the -f flag but haven't created the box_limits file.
Solution: Create the file with your domain bounds:
cat > inputs/box_limits << EOF
min_lon;-60
max_lon;-30
min_lat;-45
max_lat;-20
EOF
“No such file or directory: ‘inputs/track’” (moving framework)
Problem: You are using the -t flag but there is no track file at the default path inputs/track.
Solution: Either:
Create a track file at inputs/track (a minimal example of building one is sketched below)
Use the --trackfile flag to specify a different path:
python lorenzcycletoolkit.py data.nc -r -t --trackfile path/to/your/track.csv
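If you need to build a track file from scratch, here is a minimal sketch. The column names Lat and Lon follow the sample track shipped with the toolkit, but check the Configuration page for the exact format your version expects; the positions below are purely illustrative:
import pandas as pd

# Hypothetical, stationary cyclone position used only for illustration
track = pd.DataFrame({
    'time': pd.date_range('2020-01-03', '2020-01-05 18:00', freq='6H'),
    'Lat': -25.0,
    'Lon': -45.0,
}).set_index('time')

# Save with ';' separator, as the toolkit expects
track.to_csv('inputs/track', sep=';')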
CDS API Issues
CDS API authentication fails
Error: Connection or authentication errors when using --cdsapi.
Solution checklist:
Check that the credentials file exists:
cat ~/.cdsapirc
Verify the format:
url: https://cds.climate.copernicus.eu/api
key: YOUR-UID:YOUR-API-KEY
Verify credentials are correct: Log in to https://cds.climate.copernicus.eu/user and check your UID and API key
Check permissions:
chmod 600 ~/.cdsapirc
Accept Terms and Conditions: You must accept the Copernicus terms at https://cds.climate.copernicus.eu/ before API access works
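To test your credentials independently of the toolkit, you can try instantiating a client directly. This is just a quick sanity check, assuming the cdsapi package is installed:
import cdsapi

# Raises an exception if ~/.cdsapirc is missing or malformed
client = cdsapi.Client()
print('CDS API credentials loaded successfully')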
CDS API download is very slow
Problem: Downloads take hours or timeout.
Causes and solutions:
CDS servers are busy: Try during off-peak hours (evenings/weekends in Europe)
Domain is too large: Reduce spatial coverage or temporal resolution
Requesting too many levels: Consider requesting only tropospheric levels
Use coarser temporal resolution:
# Use 6-hour instead of 3-hour
python lorenzcycletoolkit.py file.nc -t -r --cdsapi --time-resolution 6
Still Having Issues?
If your problem isn’t covered here:
Enable verbose logging to see detailed information:
python lorenzcycletoolkit.py your_file.nc -r -f -v
Check the log file in LEC_Results/*/log.* for detailed error messages
Review the documentation:
Configuration - Setup requirements
Usage - Command-line options and data requirements
Examples and Tutorials - Step-by-step examples
Seek support:
GitHub Issues: https://github.com/daniloceano/LorenzCycleToolkit/issues
Email: danilo.oceano@gmail.com
When reporting issues, please include:
Full error message and traceback
Your command line
Relevant parts of the log file
Description of your input data (source, resolution, coverage)
LorenzCycleToolkit version: check with git describe --tags