Troubleshooting and FAQ

This page addresses common issues and frequently asked questions when using the LorenzCycleToolkit.

Track File Issues

Track file has more timesteps than input data

Problem: Your track file extends beyond the temporal coverage of your NetCDF data file.

Example:

Track file: 2020-01-01 00:00 to 2020-01-10 23:00
Input data:  2020-01-01 00:00 to 2020-01-08 18:00

Error message:

❌ Track final timestamp (2020-01-10 23:00:00) is later than data final timestamp
(2020-01-08 18:00:00). Please adjust the track file or re-download the data.

Solutions:

  1. Trim your track file to match the data coverage:

    import pandas as pd
    
    # Load track file
    track = pd.read_csv('inputs/track', sep=';', parse_dates=['time'], index_col='time')
    
    # Trim to match data period
    track_trimmed = track['2020-01-01':'2020-01-08 18:00']
    
    # Save trimmed track
    track_trimmed.to_csv('inputs/track_trimmed', sep=';')
    
  2. Download additional data to cover the full track period (if using CDS API):

    python lorenzcycletoolkit.py extended_data.nc -t -r --cdsapi --trackfile inputs/track
    
  3. Use a pre-downloaded file that covers the full period

Track file has fewer timesteps than input data

Problem: Your track file covers a shorter period than your input data.

Example:

Track file: 2020-01-03 00:00 to 2020-01-05 18:00
Input data:  2020-01-01 00:00 to 2020-01-10 23:00

Behavior: This is perfectly fine! The toolkit will automatically use only the timesteps from the input data that match the track file period.

What happens:

# The toolkit automatically does this:
data = data.sel(time=track.index.values)

Your analysis will run from 2020-01-03 00:00 to 2020-01-05 18:00, using only the relevant portion of the input data.

Use case: This is useful when you have a large dataset but only want to analyze a specific event or time period.

Track file has timesteps not present in data

Problem: Some track timesteps don’t match any data timesteps (e.g., track has 1-hour intervals but data has 3-hour intervals).

Example:

Track timesteps: 00:00, 01:00, 02:00, 03:00, 04:00, ...
Data timesteps:  00:00, 03:00, 06:00, 09:00, ...

Error message:

KeyError: 'Some timesteps from track not found in data'

Solutions:

  1. Resample your track file to match data temporal resolution:

    import pandas as pd
    
    # Load track
    track = pd.read_csv('inputs/track', sep=';', parse_dates=['time'], index_col='time')
    
    # Resample to 3-hour intervals (matching data)
    track_resampled = track.resample('3H').first()  # or .interpolate() for interpolation
    
    # Remove any NaN rows
    track_resampled = track_resampled.dropna()
    
    # Save resampled track
    track_resampled.to_csv('inputs/track_resampled', sep=';')
    
  2. If using CDS API, the toolkit automatically handles this by resampling the track:

    # The toolkit will resample your track to match --time-resolution
    python lorenzcycletoolkit.py data.nc -t -r --cdsapi --time-resolution 3 --trackfile inputs/track
    

Track temporal resolution differs from data

Problem: Track has higher temporal resolution than the data (e.g., 1-hour track vs 6-hour data).

Behavior: The toolkit checks this and raises an error if track resolution is higher than data resolution.

Error message:

❌ Track time step is higher than data time step. Please resample the track
to match the data time step.

Why this matters: You cannot analyze a system every hour if your atmospheric data is only available every 6 hours.

Solution: Resample your track file to match or exceed the data temporal resolution:

import pandas as pd

track = pd.read_csv('inputs/track', sep=';', parse_dates=['time'], index_col='time')

# If data is 6-hourly, resample track to 6-hourly
track_6h = track.resample('6H').first()  # Takes the first value in each 6h window
# OR use interpolation if you want intermediate positions
track_6h = track.resample('6H').interpolate()

track_6h = track_6h.dropna()
track_6h.to_csv('inputs/track_6h', sep=';')

Units and Variable Specifications

Do variables need to match units specified in namelist?

Short answer: No, but the namelist must specify the actual units in your file.

How it works:

  1. The namelist tells the toolkit what units are currently in your NetCDF file

  2. The toolkit reads the data with those units

  3. The toolkit automatically converts everything to SI units internally using MetPy

Example:

If your ERA5 file has temperature in Kelvin:

;standard_name;Variable;Units
Air Temperature;air_temperature;t;K

The toolkit will:

  • Read temperature as Kelvin

  • Keep it in Kelvin for calculations (Kelvin is already SI)

If your file had temperature in Celsius (unusual):

;standard_name;Variable;Units
Air Temperature;air_temperature;t;degC

The toolkit would:

  • Read temperature as Celsius

  • Convert to Kelvin internally

  • Perform all calculations in SI units

Important: The “Units” column in the namelist should match what’s actually in your file, not what you want. The toolkit handles conversions automatically.

Pressure levels must be in Pa or hPa?

Answer: Either is fine! The toolkit automatically converts to Pascals (Pa) internally.

Example from the code:

# The toolkit does this automatically:
levels_Pa = (data[LevelIndexer] * units(str(data[LevelIndexer].units))).metpy.convert_units("Pa")

Your file can have:

  • pressure_level in hPa: [1000, 925, 850, 700, 500, …]

  • pressure_level in Pa: [100000, 92500, 85000, 70000, 50000, …]

Both will work correctly. Just make sure the units attribute in your NetCDF file is set correctly.

Why does my analysis fail with “units” error?

Problem: Error mentioning units or unit conversion.

Common causes:

  1. Missing units attribute in NetCDF file:

    import xarray as xr
    
    # Check if variables have units
    ds = xr.open_dataset('your_file.nc')
    print(ds['t'].attrs)  # Should show 'units': 'K'
    print(ds['pressure_level'].attrs)  # Should show 'units': 'hPa' or 'Pa'
    

    Solution: Add units to your NetCDF file:

    ds['t'].attrs['units'] = 'K'
    ds['pressure_level'].attrs['units'] = 'hPa'
    ds.to_netcdf('your_file_fixed.nc')
    
  2. Non-standard unit names: Use standard CF conventions:

    • Temperature: K or degC (not Kelvin or celsius)

    • Pressure: Pa or hPa (not mb or millibars)

    • Winds: m/s or m s-1 (not m/sec)

  3. Geopotential vs Geopotential Height:

    • Geopotential: units should be m**2 s**-2 or m2 s-2

    • Geopotential Height: units should be m

    The toolkit handles both, but you must specify correctly in your namelist.

Data Format and Loading Issues

“Could not open file” error

Error message:

❌ Could not open file. Check if path, namelist file, and file format (.nc) are correct.

Checklist:

  1. File path is correct:

    ls -lh path/to/your/file.nc  # Verify file exists
    
  2. File is valid NetCDF:

    ncdump -h your_file.nc  # Should show header without errors
    
  3. File is not corrupted:

    import xarray as xr
    ds = xr.open_dataset('your_file.nc')  # Should open without errors
    
  4. You have read permissions:

    chmod 644 your_file.nc  # Add read permissions if needed
    

“Namelist does not match the data” error

Error message:

❌ The variable list does not match the data. Check if the 'namelist' text file is correct.

Followed by a detailed list of available coordinates and variables in your dataset.

Solution process:

  1. Check the error output - it shows all coordinates and variables in your file:

    📊 DATASET INFORMATION
    ======================================================================
    
    🗺️  Available Coordinates:
       • latitude (degrees_north) - latitude
       • longitude (degrees_east) - longitude
       • level (hPa) - pressure_level
       • time - valid_time
    
    📋 Available Variables:
       • t (K) - Temperature [air_temperature]
       • u (m/s) - U component of wind [eastward_wind]
       • v (m/s) - V component of wind [northward_wind]
       • w (Pa/s) - Vertical velocity [lagrangian_tendency_of_air_pressure]
       • z (m**2/s**2) - Geopotential [geopotential]
    
  2. Update your namelist to use the actual variable names:

    ;standard_name;Variable;Units
    Air Temperature;air_temperature;t;K
    Geopotential;geopotential;z;m**2/s**2
    Omega Velocity;omega;w;Pa/s
    Eastward Wind Component;eastward_wind;u;m/s
    Northward Wind Component;northward_wind;v;m/s
    Longitude;;longitude
    Latitude;;latitude
    Time;;time
    Vertical Level;;level
    
  3. Use a preset if available:

    # For ERA5 data
    cp inputs/namelist_ERA5-cdsapi inputs/namelist
    
    # For NCEP-R2 data
    cp inputs/namelist_NCEP-R2 inputs/namelist
    

Memory and Performance Issues

“MemoryError” or system becomes unresponsive

Problem: Your dataset is too large for available RAM.

Immediate solutions:

  1. Reduce spatial domain: Pre-process your data to a smaller region:

    import xarray as xr
    
    ds = xr.open_dataset('large_file.nc')
    
    # Subset to smaller region
    ds_subset = ds.sel(
        latitude=slice(-50, -10),
        longitude=slice(-70, -30)
    )
    
    ds_subset.to_netcdf('smaller_file.nc')
    
  2. Reduce temporal coverage: Analyze shorter periods:

    ds_subset = ds.sel(time=slice('2020-01-01', '2020-01-07'))
    
  3. Reduce vertical levels: Keep only tropospheric levels:

    ds_subset = ds.sel(level=slice(1000, 100))  # 1000 to 100 hPa
    
  4. Increase temporal resolution: Use 6-hour instead of 3-hour data:

    ds_subset = ds.sel(time=ds.time.dt.hour.isin([0, 6, 12, 18]))
    
  5. Remove unnecessary variables:

    # Keep only required variables
    required_vars = ['t', 'u', 'v', 'w', 'z', 'latitude', 'longitude', 'level', 'time']
    ds_subset = ds[required_vars]
    

See also: Usage section on “System Requirements and Performance”

Analysis takes too long

Problem: Processing is slower than expected.

Optimization tips:

  1. Use appropriate temporal resolution: 6-hour data processes faster than 1-hour

  2. Reduce domain size: Analyze only the region of interest

  3. Disable plots for initial runs: Remove -p flag during testing

  4. Use fixed framework instead of interactive: -f is much faster than -c

  5. Check your data preprocessing: Pre-subset data before running the toolkit

Configuration Issues

“No such file or directory: ‘inputs/namelist’”

Problem: You didn’t create the namelist file.

Solution: Copy one of the presets:

# For ERA5 data
cp inputs/namelist_ERA5-cdsapi inputs/namelist

# For NCEP Reanalysis 2
cp inputs/namelist_NCEP-R2 inputs/namelist

# For NCEP Reanalysis 1
cp inputs/namelist_NCEP-R1 inputs/namelist

See also: Configuration for detailed instructions.

“No such file or directory: ‘inputs/box_limits’” (fixed framework)

Problem: Using -f flag but haven’t created box_limits file.

Solution: Create the file with your domain bounds:

cat > inputs/box_limits << EOF
min_lon;-60
max_lon;-30
min_lat;-45
max_lat;-20
EOF

“No such file or directory: ‘inputs/track’” (moving framework)

Problem: Using -t flag but haven’t specified a track file.

Solution: Either:

  1. Create a track file at inputs/track

  2. Use the –trackfile flag to specify a different path:

    python lorenzcycletoolkit.py data.nc -r -t --trackfile path/to/your/track.csv
    

CDS API Issues

CDS API authentication fails

Error: Connection or authentication errors when using --cdsapi.

Solution checklist:

  1. Check credentials file exists:

    cat ~/.cdsapirc
    
  2. Verify format:

    url: https://cds.climate.copernicus.eu/api
    key: YOUR-UID:YOUR-API-KEY
    
  3. Verify credentials are correct: Log in to https://cds.climate.copernicus.eu/user and check your UID and API key

  4. Check permissions:

    chmod 600 ~/.cdsapirc
    
  5. Accept Terms and Conditions: You must accept the Copernicus terms at https://cds.climate.copernicus.eu/ before API access works

CDS API download is very slow

Problem: Downloads take hours or timeout.

Causes and solutions:

  1. CDS servers are busy: Try during off-peak hours (evenings/weekends in Europe)

  2. Domain is too large: Reduce spatial coverage or temporal resolution

  3. Requesting too many levels: Consider requesting only tropospheric levels

  4. Use coarser temporal resolution:

    # Use 6-hour instead of 3-hour
    python lorenzcycletoolkit.py file.nc -t -r --cdsapi --time-resolution 6
    

Still Having Issues?

If your problem isn’t covered here:

  1. Enable verbose logging to see detailed information:

    python lorenzcycletoolkit.py your_file.nc -r -f -v
    
  2. Check the log file in LEC_Results/*/log.* for detailed error messages

  3. Review the documentation:

  4. Seek support:

When reporting issues, please include:

  • Full error message and traceback

  • Your command line

  • Relevant parts of the log file

  • Description of your input data (source, resolution, coverage)

  • LorenzCycleToolkit version: check with git describe --tags