Troubleshooting and FAQ

This page addresses common issues and frequently asked questions when using the LorenzCycleToolkit.

Track File Issues

Track file has more timesteps than input data

Problem: Your track file extends beyond the temporal coverage of your NetCDF data file.

Example:

Track file: 2020-01-01 00:00 to 2020-01-10 23:00
Input data:  2020-01-01 00:00 to 2020-01-08 18:00

Error message:

❌ Track final timestamp (2020-01-10 23:00:00) is later than data final timestamp
(2020-01-08 18:00:00). Please adjust the track file or re-download the data.

Solutions:

Trim your track file to match the data coverage:

import pandas as pd

# Load track file
track = pd.read_csv('inputs/track', sep=';', parse_dates=['time'], index_col='time')

# Trim to match data period
track_trimmed = track.loc['2020-01-01':'2020-01-08 18:00']

# Save trimmed track
track_trimmed.to_csv('inputs/track_trimmed', sep=';')

Download additional data to cover the full track period (if using CDS API):

python lorenzcycletoolkit.py extended_data.nc -t -r --cdsapi --trackfile inputs/track

Use a pre-downloaded file that covers the full period
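
Before choosing one of the solutions above, it can help to confirm how far the track overruns the data. The following is a minimal sketch, not part of the toolkit: the function name check_track_within_data is hypothetical, and the timestamps are synthetic values reproducing the example rather than values read from real files.

```python
import pandas as pd


def check_track_within_data(track_times, data_times):
    """Return (ok, message) telling whether the track fits inside the data period."""
    track_times = pd.DatetimeIndex(track_times)
    data_times = pd.DatetimeIndex(data_times)
    problems = []
    if track_times.min() < data_times.min():
        problems.append(f"track starts {data_times.min() - track_times.min()} before the data")
    if track_times.max() > data_times.max():
        problems.append(f"track ends {track_times.max() - data_times.max()} after the data")
    ok = not problems
    message = "track lies within the data period" if ok else "; ".join(problems)
    return ok, message


# Synthetic timestamps reproducing the example above (no files are read here).
track_times = pd.date_range("2020-01-01 00:00", "2020-01-10 23:00", freq=pd.Timedelta(hours=1))
data_times = pd.date_range("2020-01-01 00:00", "2020-01-08 18:00", freq=pd.Timedelta(hours=6))

ok, message = check_track_within_data(track_times, data_times)
print(ok, message)
```

With real inputs, track_times would be the index of the track DataFrame and data_times the time coordinate of the NetCDF file.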

Track file has fewer timesteps than input data

Problem: Your track file covers a shorter period than your input data.

Example:

Track file: 2020-01-03 00:00 to 2020-01-05 18:00
Input data:  2020-01-01 00:00 to 2020-01-10 23:00

Behavior: This is perfectly fine! The toolkit will automatically use only the timesteps from the input data that match the track file period.

What happens:

# The toolkit automatically does this:
data = data.sel(time=track.index.values)

Your analysis will run from 2020-01-03 00:00 to 2020-01-05 18:00, using only the relevant portion of the input data.

Use case: This is useful when you have a large dataset but only want to analyze a specific event or time period.

Track file has timesteps not present in data

Problem: Some track timesteps don’t match any data timesteps (e.g., track has 1-hour intervals but data has 3-hour intervals).

Example:

Track timesteps: 00:00, 01:00, 02:00, 03:00, 04:00, ...
Data timesteps:  00:00, 03:00, 06:00, 09:00, ...

Error message:

KeyError: 'Some timesteps from track not found in data'

Solutions:

Resample your track file to match data temporal resolution:

import pandas as pd

# Load track
track = pd.read_csv('inputs/track', sep=';', parse_dates=['time'], index_col='time')

# Resample to 3-hour intervals (matching data)
track_resampled = track.resample('3H').first()  # or .interpolate() for interpolation

# Remove any NaN rows
track_resampled = track_resampled.dropna()

# Save resampled track
track_resampled.to_csv('inputs/track_resampled', sep=';')

If using CDS API, the toolkit automatically handles this by resampling the track:

# The toolkit will resample your track to match --time-resolution
python lorenzcycletoolkit.py data.nc -t -r --cdsapi --time-resolution 3 --trackfile inputs/track
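
To see which track timesteps are responsible for the KeyError before resampling, the two sets of timestamps can be compared directly. This is a sketch under assumptions: missing_track_times is a hypothetical helper, and the timestamps are synthetic (hourly track, 3-hourly data) rather than read from files.

```python
import pandas as pd


def missing_track_times(track_times, data_times):
    """Return the track timestamps that have no exact match in the data."""
    return pd.DatetimeIndex(track_times).difference(pd.DatetimeIndex(data_times))


# Synthetic example: hourly track against 3-hourly data.
track_times = pd.date_range("2020-01-01 00:00", "2020-01-01 06:00", freq=pd.Timedelta(hours=1))
data_times = pd.date_range("2020-01-01 00:00", "2020-01-01 06:00", freq=pd.Timedelta(hours=3))

missing = missing_track_times(track_times, data_times)
print(f"{len(missing)} of {len(track_times)} track timesteps are absent from the data")
for stamp in missing:
    print("  ", stamp)
```

An empty result means every track timestep has a matching data timestep and no resampling is needed.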

Track temporal resolution differs from data

Problem: Track has higher temporal resolution than the data (e.g., 1-hour track vs 6-hour data).

Behavior: The toolkit checks this and raises an error if track resolution is higher than data resolution.

Error message:

❌ Track time step is higher than data time step. Please resample the track
to match the data time step.

Why this matters: You cannot analyze a system every hour if your atmospheric data is only available every 6 hours.

Solution: Resample your track file so that its time step matches the data time step (or is a whole multiple of it):

import pandas as pd

track = pd.read_csv('inputs/track', sep=';', parse_dates=['time'], index_col='time')

# If data is 6-hourly, resample track to 6-hourly.
# Option A: take the first value in each 6 h window
track_6h = track.resample('6H').first()
# Option B (use instead of A): interpolate positions onto the 6 h grid
# track_6h = track.resample('6H').interpolate()

track_6h = track_6h.dropna()
track_6h.to_csv('inputs/track_6h', sep=';')
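
The comparison described in this section can be approximated ahead of time by inferring the time step of each series. The sketch below is an assumption about how such a check might look, not the toolkit's own code; dominant_step is a hypothetical helper and the timestamps are synthetic.

```python
import pandas as pd


def dominant_step(times):
    """Return the most frequent spacing between consecutive timestamps."""
    steps = pd.Series(pd.DatetimeIndex(times)).diff().dropna()
    return steps.value_counts().idxmax()


# Synthetic example: 1-hour track against 6-hour data.
track_times = pd.date_range("2020-01-01", periods=25, freq=pd.Timedelta(hours=1))
data_times = pd.date_range("2020-01-01", periods=5, freq=pd.Timedelta(hours=6))

track_step = dominant_step(track_times)
data_step = dominant_step(data_times)

if track_step < data_step:
    print(f"Track step {track_step} is finer than data step {data_step}: resample the track")
elif track_step % data_step != pd.Timedelta(0):
    print("Track step is not a whole multiple of the data step: some timesteps will not match")
else:
    print("Track step is compatible with the data step")
```

Using the most frequent spacing rather than the minimum keeps a single irregular gap in the track from dominating the result.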

Units and Variable Specifications

Do variables need to match the units specified in the namelist?

Short answer: No, but the namelist must specify the actual units in your file.

How it works:

The namelist tells the toolkit what units are currently in your NetCDF file
The toolkit reads the data with those units
The toolkit automatically converts everything to SI units internally using MetPy

Example:

If your ERA5 file has temperature in Kelvin:

;standard_name;Variable;Units
Air Temperature;air_temperature;t;K

The toolkit will:

Read temperature as Kelvin
Keep it in Kelvin for calculations (Kelvin is already SI)

If your file had temperature in Celsius (unusual):

;standard_name;Variable;Units
Air Temperature;air_temperature;t;degC

The toolkit would:

Read temperature as Celsius
Convert to Kelvin internally
Perform all calculations in SI units

Important: The “Units” column in the namelist should match what’s actually in your file, not what you want. The toolkit handles conversions automatically.

Do pressure levels need to be in Pa or hPa?

Answer: Either is fine! The toolkit automatically converts to Pascals (Pa) internally.

Example from the code:

# The toolkit does this automatically:
levels_Pa = (data[LevelIndexer] * units(str(data[LevelIndexer].units))).metpy.convert_units("Pa")

Your file can have:

pressure_level in hPa: [1000, 925, 850, 700, 500, …]
pressure_level in Pa: [100000, 92500, 85000, 70000, 50000, …]

Both will work correctly. Just make sure the units attribute in your NetCDF file is set correctly.
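
The equivalence of the two inputs can be illustrated in plain Python. This sketch deliberately does not use MetPy and is not the toolkit's conversion path; the factor table and the levels_to_pascals helper are hypothetical and only show why a correct units label matters.

```python
# Conversion factors to pascals for the two units discussed above.
TO_PASCALS = {"Pa": 1.0, "hPa": 100.0}


def levels_to_pascals(levels, unit):
    """Convert pressure levels to Pa, failing loudly on a missing or unknown unit."""
    if unit not in TO_PASCALS:
        raise ValueError(f"Unsupported or missing pressure unit: {unit!r}")
    factor = TO_PASCALS[unit]
    return [level * factor for level in levels]


from_hpa = levels_to_pascals([1000, 925, 850, 700, 500], "hPa")
from_pa = levels_to_pascals([100000, 92500, 85000, 70000, 50000], "Pa")

# Both inputs describe the same levels once the unit label is applied.
print(from_hpa == from_pa)
```

If the units attribute were wrong (for example hPa values labelled as Pa), the converted levels would be off by a factor of 100, which is why the attribute has to describe the file as it is.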

Why does my analysis fail with “units” error?

Problem: Error mentioning units or unit conversion.

Common causes:

Missing units attribute in NetCDF file:

import xarray as xr

# Check if variables have units
ds = xr.open_dataset('your_file.nc')
print(ds['t'].attrs)  # Should show 'units': 'K'
print(ds['pressure_level'].attrs)  # Should show 'units': 'hPa' or 'Pa'

Solution: Add units to your NetCDF file:

ds['t'].attrs['units'] = 'K'
ds['pressure_level'].attrs['units'] = 'hPa'
ds.to_netcdf('your_file_fixed.nc')

Non-standard unit names: Use standard CF conventions:

- Temperature: K or degC (not Kelvin or celsius)
- Pressure: Pa or hPa (not mb or millibars)
- Winds: m/s or m s-1 (not m/sec)

Geopotential vs Geopotential Height:

- Geopotential: units should be m**2 s**-2 or m2 s-2
- Geopotential Height: units should be m

The toolkit handles both, but you must specify the correct one in your namelist.
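
The first cause above (a missing units attribute) can be checked for all required variables at once. The sketch below works on a plain mapping of variable name to attributes so that it runs without any NetCDF file; variables_missing_units is a hypothetical helper and the attribute values are made up for illustration.

```python
def variables_missing_units(attrs_by_name, required):
    """Return required names that are absent or lack a non-empty 'units' attribute."""
    missing = []
    for name in required:
        attrs = attrs_by_name.get(name, {})
        if not str(attrs.get("units", "")).strip():
            missing.append(name)
    return missing


# Stand-in for a real file. With xarray, a similar mapping could be built as
# {name: dict(ds[name].attrs) for name in ds.variables}.
attrs_by_name = {
    "t": {"units": "K", "long_name": "Temperature"},
    "u": {"units": "m/s"},
    "z": {"long_name": "Geopotential"},  # no units attribute
    "pressure_level": {"units": ""},     # empty units attribute
}

print(variables_missing_units(attrs_by_name, ["t", "u", "z", "pressure_level"]))
```

Any name returned here is a candidate for the attrs['units'] fix shown earlier in this section.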

Data Format and Loading Issues

“Could not open file” error

Error message:

❌ Could not open file. Check if path, namelist file, and file format (.nc) are correct.

Checklist:

File path is correct:

ls -lh path/to/your/file.nc  # Verify file exists

File is valid NetCDF:

ncdump -h your_file.nc  # Should show header without errors

File is not corrupted:

import xarray as xr
ds = xr.open_dataset('your_file.nc')  # Should open without errors

You have read permissions:

chmod 644 your_file.nc  # Add read permissions if needed
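
The checklist above can also be run from Python in one pass. This is a quick heuristic sketch, not a toolkit feature: diagnose_netcdf_path is a hypothetical helper, and it only inspects the leading bytes of the file (classic NetCDF files begin with CDF, NetCDF-4 files begin with the HDF5 signature), so it cannot detect a file that is truncated further in.

```python
import os

# Leading bytes of the two on-disk layouts used by NetCDF files.
NETCDF_CLASSIC_MAGIC = b"CDF"
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"  # NetCDF-4 files are stored as HDF5


def diagnose_netcdf_path(path):
    """Return a short diagnosis string for a candidate NetCDF file."""
    if not os.path.isfile(path):
        return "missing: no file at this path"
    if not os.access(path, os.R_OK):
        return "unreadable: check file permissions"
    with open(path, "rb") as handle:
        header = handle.read(8)
    if header.startswith(NETCDF_CLASSIC_MAGIC) or header.startswith(HDF5_MAGIC):
        return "ok: header looks like NetCDF"
    return "invalid: header is not NetCDF classic or HDF5"


print(diagnose_netcdf_path("path/to/your/file.nc"))
```

A result of "ok" only means the file starts like a NetCDF file; opening it with xarray remains the definitive test.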

“Namelist does not match the data” error

Error message:

❌ The variable list does not match the data. Check if the 'namelist' text file is correct.

Followed by a detailed list of available coordinates and variables in your dataset.

Solution process:

Check the error output - it shows all coordinates and variables in your file:

📊 DATASET INFORMATION
======================================================================

🗺️  Available Coordinates:
   • latitude (degrees_north) - latitude
   • longitude (degrees_east) - longitude
   • level (hPa) - pressure_level
   • time - valid_time

📋 Available Variables:
   • t (K) - Temperature [air_temperature]
   • u (m/s) - U component of wind [eastward_wind]
   • v (m/s) - V component of wind [northward_wind]
   • w (Pa/s) - Vertical velocity [lagrangian_tendency_of_air_pressure]
   • z (m**2/s**2) - Geopotential [geopotential]

Update your namelist to use the actual variable names:

;standard_name;Variable;Units
Air Temperature;air_temperature;t;K
Geopotential;geopotential;z;m**2/s**2
Omega Velocity;omega;w;Pa/s
Eastward Wind Component;eastward_wind;u;m/s
Northward Wind Component;northward_wind;v;m/s
Longitude;;longitude
Latitude;;latitude
Time;;time
Vertical Level;;level

Use a preset if available:

# For ERA5 data
cp inputs/namelist_ERA5-cdsapi inputs/namelist

# For NCEP-R2 data
cp inputs/namelist_NCEP-R2 inputs/namelist
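
A mismatch between namelist and data can be located by comparing the namelist's Variable column with the names in the dataset. The sketch below parses the namelist text from the example with the standard csv module; the helper names and the dataset_names list are hypothetical (with xarray, the names could be taken from list(ds.variables)).

```python
import csv
import io

# Namelist text copied from the example above; a real run would read inputs/namelist.
NAMELIST_TEXT = """;standard_name;Variable;Units
Air Temperature;air_temperature;t;K
Geopotential;geopotential;z;m**2/s**2
Omega Velocity;omega;w;Pa/s
Eastward Wind Component;eastward_wind;u;m/s
Northward Wind Component;northward_wind;v;m/s
Longitude;;longitude
Latitude;;latitude
Time;;time
Vertical Level;;level
"""


def namelist_names(text):
    """Return the names listed in the 'Variable' column (third field) of a namelist."""
    rows = csv.reader(io.StringIO(text), delimiter=";")
    next(rows)  # skip the header row
    return [row[2] for row in rows if len(row) > 2 and row[2]]


def names_not_in_dataset(text, dataset_names):
    """Return namelist names that are absent from the dataset."""
    available = set(dataset_names)
    return [name for name in namelist_names(text) if name not in available]


# Hypothetical dataset that uses 'pressure_level' and 'valid_time' instead.
dataset_names = ["t", "u", "v", "w", "z", "latitude", "longitude", "pressure_level", "valid_time"]
print(names_not_in_dataset(NAMELIST_TEXT, dataset_names))
```

Each name printed is a namelist entry that should be renamed to match the file, or the namelist preset swapped for one that fits the data source.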

Memory and Performance Issues

“MemoryError” or system becomes unresponsive

Problem: Your dataset is too large for available RAM.

Immediate solutions:

Reduce spatial domain: Pre-process your data to a smaller region:

import xarray as xr

ds = xr.open_dataset('large_file.nc')

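# Subset to smaller region.
# Note: slice bounds must follow the coordinate's stored order. If latitude
# is stored north-to-south (common in ERA5 files), use slice(-10, -50).
ds_subset = ds.sel(
    latitude=slice(-50, -10),
    longitude=slice(-70, -30)
)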

ds_subset.to_netcdf('smaller_file.nc')

Reduce temporal coverage: Analyze shorter periods:

ds_subset = ds.sel(time=slice('2020-01-01', '2020-01-07'))

Reduce vertical levels: Keep only tropospheric levels:

ds_subset = ds.sel(level=slice(1000, 100))  # 1000 to 100 hPa; swap the bounds if levels are stored in ascending order

Reduce temporal resolution: Use 6-hour instead of 3-hour data:

ds_subset = ds.sel(time=ds.time.dt.hour.isin([0, 6, 12, 18]))

Remove unnecessary variables:

# Keep only required variables
required_vars = ['t', 'u', 'v', 'w', 'z', 'latitude', 'longitude', 'level', 'time']
ds_subset = ds[required_vars]

See also: Usage section on “System Requirements and Performance”
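
To judge which of the reductions above is worth applying, a rough size estimate is often enough. The sketch below multiplies the array dimensions by the bytes per value; estimate_bytes is a hypothetical helper, the grid numbers are an invented example, and the result covers only the raw input fields (intermediate arrays created during the analysis come on top of it).

```python
def estimate_bytes(n_time, n_level, n_lat, n_lon, n_vars, bytes_per_value=4):
    """Raw size in bytes of n_vars 4-D fields; 4 bytes per value assumes float32."""
    return n_time * n_level * n_lat * n_lon * n_vars * bytes_per_value


# Hypothetical case: 7 days of 3-hourly data (56 steps), 37 levels,
# a 40 x 40 degree box on a 0.25 degree grid (161 x 161 points), 5 variables.
full = estimate_bytes(n_time=56, n_level=37, n_lat=161, n_lon=161, n_vars=5)

# Same case after switching to 6-hourly data (28 steps).
coarser = estimate_bytes(n_time=28, n_level=37, n_lat=161, n_lon=161, n_vars=5)

print(f"3-hourly: {full / 1024**3:.2f} GiB, 6-hourly: {coarser / 1024**3:.2f} GiB")
```

Because the size is a plain product, halving any one dimension (timesteps, levels, or grid points along one axis) halves the estimate.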

Analysis takes too long

Problem: Processing is slower than expected.

Optimization tips:

Use appropriate temporal resolution: 6-hour data processes faster than 1-hour
Reduce domain size: Analyze only the region of interest
Disable plots for initial runs: Remove -p flag during testing
Use fixed framework instead of interactive: -f is much faster than -c
Check your data preprocessing: Pre-subset data before running the toolkit

Configuration Issues

“No such file or directory: ‘inputs/namelist’”

Problem: You didn’t create the namelist file.

Solution: Copy one of the presets:

# For ERA5 data
cp inputs/namelist_ERA5-cdsapi inputs/namelist

# For NCEP Reanalysis 2
cp inputs/namelist_NCEP-R2 inputs/namelist

# For NCEP Reanalysis 1
cp inputs/namelist_NCEP-R1 inputs/namelist

See also: Configuration for detailed instructions.

“No such file or directory: ‘inputs/box_limits’” (fixed framework)

Problem: You are using the -f flag but have not created the box_limits file.

Solution: Create the file with your domain bounds:

cat > inputs/box_limits << EOF
min_lon;-60
max_lon;-30
min_lat;-45
max_lat;-20
EOF
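
Once the file exists, its contents can be sanity-checked before a run. This sketch parses the key;value layout shown above; read_box_limits is a hypothetical helper, and the rule that each min must be smaller than its max is an assumption made here, not a documented toolkit requirement.

```python
import csv
import io

# Same content as the box_limits file created above.
BOX_LIMITS_TEXT = """min_lon;-60
max_lon;-30
min_lat;-45
max_lat;-20
"""


def read_box_limits(text):
    """Parse 'key;value' lines into a dict of floats and check the bounds are ordered."""
    limits = {}
    for row in csv.reader(io.StringIO(text), delimiter=";"):
        if row:  # skip blank lines
            limits[row[0].strip()] = float(row[1])
    expected = {"min_lon", "max_lon", "min_lat", "max_lat"}
    if set(limits) != expected:
        raise ValueError(f"Expected keys {sorted(expected)}, found {sorted(limits)}")
    if limits["min_lon"] >= limits["max_lon"] or limits["min_lat"] >= limits["max_lat"]:
        raise ValueError("Each min value must be smaller than its max value")
    if not -90 <= limits["min_lat"] <= limits["max_lat"] <= 90:
        raise ValueError("Latitudes must lie between -90 and 90")
    return limits


print(read_box_limits(BOX_LIMITS_TEXT))
```

To check a real file, pass the result of open('inputs/box_limits').read() instead of the inline text.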

“No such file or directory: ‘inputs/track’” (moving framework)

Problem: You are using the -t flag but have not specified a track file.

Solution: Either:

Create a track file at inputs/track

Use the --trackfile flag to specify a different path:

python lorenzcycletoolkit.py data.nc -r -t --trackfile path/to/your/track.csv

CDS API Issues

CDS API authentication fails

Error: Connection or authentication errors when using --cdsapi.

Solution checklist:

Check credentials file exists:
```
cat ~/.cdsapirc
```

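Verify format (the key layout depends on the CDS version: the legacy API expected UID:API-KEY, while the current CDS issues a single personal access token with no UID prefix):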

url: https://cds.climate.copernicus.eu/api
key: YOUR-UID:YOUR-API-KEY

Verify credentials are correct: Log in to https://cds.climate.copernicus.eu/user and check your UID and API key
Check permissions:
```
chmod 600 ~/.cdsapirc
```
Accept Terms and Conditions: You must accept the Copernicus terms at https://cds.climate.copernicus.eu/ before API access works
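
The format check in the list above can be automated for the file's contents. This is a minimal sketch with hypothetical helper names; it only verifies that url and key entries are present and that the URL uses https, and it cannot tell whether the credentials themselves are valid.

```python
def parse_cdsapirc(text):
    """Parse 'name: value' lines, splitting only on the first colon of each line."""
    settings = {}
    for line in text.splitlines():
        if ":" in line:
            name, value = line.split(":", 1)
            settings[name.strip()] = value.strip()
    return settings


def cdsapirc_problems(text):
    """Return a list of problems found in the contents of a .cdsapirc file."""
    settings = parse_cdsapirc(text)
    problems = [f"missing '{name}' entry" for name in ("url", "key") if not settings.get(name)]
    if settings.get("url") and not settings["url"].startswith("https://"):
        problems.append("url does not start with https://")
    return problems


# Placeholder contents in the format shown above; a real check would read
# the text from ~/.cdsapirc instead.
sample = "url: https://cds.climate.copernicus.eu/api\nkey: YOUR-UID:YOUR-API-KEY\n"
print(cdsapirc_problems(sample))
print(cdsapirc_problems("url: https://cds.climate.copernicus.eu/api\n"))
```

Splitting on the first colon only keeps the colon inside the URL and inside the key value from being mistaken for a separator.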

CDS API download is very slow

Problem: Downloads take hours or time out.

Causes and solutions:

CDS servers are busy: Try during off-peak hours (evenings/weekends in Europe)
Domain is too large: Reduce spatial coverage or temporal resolution
Requesting too many levels: Consider requesting only tropospheric levels

Use coarser temporal resolution:

# Use 6-hour instead of 3-hour
python lorenzcycletoolkit.py file.nc -t -r --cdsapi --time-resolution 6

Still Having Issues?

If your problem isn’t covered here:

Enable verbose logging to see detailed information:

python lorenzcycletoolkit.py your_file.nc -r -f -v

Check the log file in LEC_Results/*/log.* for detailed error messages

Review the documentation:

- Configuration - Setup requirements
- Usage - Command-line options and data requirements
- Examples and Tutorials - Step-by-step examples

Seek support:

- GitHub Issues: https://github.com/daniloceano/LorenzCycleToolkit/issues
- Email: danilo.oceano@gmail.com

When reporting issues, please include:

Full error message and traceback
Your command line
Relevant parts of the log file
Description of your input data (source, resolution, coverage)
LorenzCycleToolkit version: check with git describe --tags