Troubleshooting and FAQ ======================= This page addresses common issues and frequently asked questions when using the LorenzCycleToolkit. Track File Issues ----------------- Track file has more timesteps than input data ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Problem**: Your track file extends beyond the temporal coverage of your NetCDF data file. **Example**: .. code-block:: text Track file: 2020-01-01 00:00 to 2020-01-10 23:00 Input data: 2020-01-01 00:00 to 2020-01-08 18:00 **Error message**: .. code-block:: text ❌ Track final timestamp (2020-01-10 23:00:00) is later than data final timestamp (2020-01-08 18:00:00). Please adjust the track file or re-download the data. **Solutions**: 1. **Trim your track file** to match the data coverage: .. code-block:: python import pandas as pd # Load track file track = pd.read_csv('inputs/track', sep=';', parse_dates=['time'], index_col='time') # Trim to match data period track_trimmed = track['2020-01-01':'2020-01-08 18:00'] # Save trimmed track track_trimmed.to_csv('inputs/track_trimmed', sep=';') 2. **Download additional data** to cover the full track period (if using CDS API): .. code-block:: bash python lorenzcycletoolkit.py extended_data.nc -t -r --cdsapi --trackfile inputs/track 3. **Use a pre-downloaded file** that covers the full period Track file has fewer timesteps than input data ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Problem**: Your track file covers a shorter period than your input data. **Example**: .. code-block:: text Track file: 2020-01-03 00:00 to 2020-01-05 18:00 Input data: 2020-01-01 00:00 to 2020-01-10 23:00 **Behavior**: This is **perfectly fine**! The toolkit will automatically use only the timesteps from the input data that match the track file period. **What happens**: .. code-block:: python # The toolkit automatically does this: data = data.sel(time=track.index.values) Your analysis will run from 2020-01-03 00:00 to 2020-01-05 18:00, using only the relevant portion of the input data. **Use case**: This is useful when you have a large dataset but only want to analyze a specific event or time period. Track file has timesteps not present in data ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Problem**: Some track timesteps don't match any data timesteps (e.g., track has 1-hour intervals but data has 3-hour intervals). **Example**: .. code-block:: text Track timesteps: 00:00, 01:00, 02:00, 03:00, 04:00, ... Data timesteps: 00:00, 03:00, 06:00, 09:00, ... **Error message**: .. code-block:: text KeyError: 'Some timesteps from track not found in data' **Solutions**: 1. **Resample your track file** to match data temporal resolution: .. code-block:: python import pandas as pd # Load track track = pd.read_csv('inputs/track', sep=';', parse_dates=['time'], index_col='time') # Resample to 3-hour intervals (matching data) track_resampled = track.resample('3H').first() # or .interpolate() for interpolation # Remove any NaN rows track_resampled = track_resampled.dropna() # Save resampled track track_resampled.to_csv('inputs/track_resampled', sep=';') 2. **If using CDS API**, the toolkit automatically handles this by resampling the track: .. code-block:: bash # The toolkit will resample your track to match --time-resolution python lorenzcycletoolkit.py data.nc -t -r --cdsapi --time-resolution 3 --trackfile inputs/track Track temporal resolution differs from data ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Problem**: Track has higher temporal resolution than the data (e.g., 1-hour track vs 6-hour data). **Behavior**: The toolkit checks this and raises an error if track resolution is higher than data resolution. **Error message**: .. code-block:: text ❌ Track time step is higher than data time step. Please resample the track to match the data time step. **Why this matters**: You cannot analyze a system every hour if your atmospheric data is only available every 6 hours. **Solution**: Resample your track file to match or exceed the data temporal resolution: .. code-block:: python import pandas as pd track = pd.read_csv('inputs/track', sep=';', parse_dates=['time'], index_col='time') # If data is 6-hourly, resample track to 6-hourly track_6h = track.resample('6H').first() # Takes the first value in each 6h window # OR use interpolation if you want intermediate positions track_6h = track.resample('6H').interpolate() track_6h = track_6h.dropna() track_6h.to_csv('inputs/track_6h', sep=';') Units and Variable Specifications ---------------------------------- Do variables need to match units specified in namelist? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Short answer**: No, but the namelist must specify the **actual units in your file**. **How it works**: 1. The namelist tells the toolkit what units are **currently** in your NetCDF file 2. The toolkit reads the data with those units 3. The toolkit **automatically converts** everything to SI units internally using MetPy **Example**: If your ERA5 file has temperature in Kelvin: .. code-block:: text ;standard_name;Variable;Units Air Temperature;air_temperature;t;K The toolkit will: - Read temperature as Kelvin - Keep it in Kelvin for calculations (Kelvin is already SI) If your file had temperature in Celsius (unusual): .. code-block:: text ;standard_name;Variable;Units Air Temperature;air_temperature;t;degC The toolkit would: - Read temperature as Celsius - Convert to Kelvin internally - Perform all calculations in SI units **Important**: The "Units" column in the namelist should match what's **actually in your file**, not what you want. The toolkit handles conversions automatically. Pressure levels must be in Pa or hPa? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Answer**: Either is fine! The toolkit automatically converts to Pascals (Pa) internally. **Example from the code**: .. code-block:: python # The toolkit does this automatically: levels_Pa = (data[LevelIndexer] * units(str(data[LevelIndexer].units))).metpy.convert_units("Pa") Your file can have: - ``pressure_level`` in hPa: [1000, 925, 850, 700, 500, ...] - ``pressure_level`` in Pa: [100000, 92500, 85000, 70000, 50000, ...] Both will work correctly. Just make sure the units attribute in your NetCDF file is set correctly. Why does my analysis fail with "units" error? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Problem**: Error mentioning units or unit conversion. **Common causes**: 1. **Missing units attribute in NetCDF file**: .. code-block:: python import xarray as xr # Check if variables have units ds = xr.open_dataset('your_file.nc') print(ds['t'].attrs) # Should show 'units': 'K' print(ds['pressure_level'].attrs) # Should show 'units': 'hPa' or 'Pa' **Solution**: Add units to your NetCDF file: .. code-block:: python ds['t'].attrs['units'] = 'K' ds['pressure_level'].attrs['units'] = 'hPa' ds.to_netcdf('your_file_fixed.nc') 2. **Non-standard unit names**: Use standard CF conventions: - Temperature: ``K`` or ``degC`` (not ``Kelvin`` or ``celsius``) - Pressure: ``Pa`` or ``hPa`` (not ``mb`` or ``millibars``) - Winds: ``m/s`` or ``m s-1`` (not ``m/sec``) 3. **Geopotential vs Geopotential Height**: - Geopotential: units should be ``m**2 s**-2`` or ``m2 s-2`` - Geopotential Height: units should be ``m`` The toolkit handles both, but you must specify correctly in your namelist. Data Format and Loading Issues ------------------------------- "Could not open file" error ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Error message**: .. code-block:: text ❌ Could not open file. Check if path, namelist file, and file format (.nc) are correct. **Checklist**: 1. **File path is correct**: .. code-block:: bash ls -lh path/to/your/file.nc # Verify file exists 2. **File is valid NetCDF**: .. code-block:: bash ncdump -h your_file.nc # Should show header without errors 3. **File is not corrupted**: .. code-block:: python import xarray as xr ds = xr.open_dataset('your_file.nc') # Should open without errors 4. **You have read permissions**: .. code-block:: bash chmod 644 your_file.nc # Add read permissions if needed "Namelist does not match the data" error ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Error message**: .. code-block:: text ❌ The variable list does not match the data. Check if the 'namelist' text file is correct. Followed by a detailed list of available coordinates and variables in your dataset. **Solution process**: 1. **Check the error output** - it shows all coordinates and variables in your file: .. code-block:: text πŸ“Š DATASET INFORMATION ====================================================================== πŸ—ΊοΈ Available Coordinates: β€’ latitude (degrees_north) - latitude β€’ longitude (degrees_east) - longitude β€’ level (hPa) - pressure_level β€’ time - valid_time πŸ“‹ Available Variables: β€’ t (K) - Temperature [air_temperature] β€’ u (m/s) - U component of wind [eastward_wind] β€’ v (m/s) - V component of wind [northward_wind] β€’ w (Pa/s) - Vertical velocity [lagrangian_tendency_of_air_pressure] β€’ z (m**2/s**2) - Geopotential [geopotential] 2. **Update your namelist** to use the actual variable names: .. code-block:: text ;standard_name;Variable;Units Air Temperature;air_temperature;t;K Geopotential;geopotential;z;m**2/s**2 Omega Velocity;omega;w;Pa/s Eastward Wind Component;eastward_wind;u;m/s Northward Wind Component;northward_wind;v;m/s Longitude;;longitude Latitude;;latitude Time;;time Vertical Level;;level 3. **Use a preset if available**: .. code-block:: bash # For ERA5 data cp inputs/namelist_ERA5-cdsapi inputs/namelist # For NCEP-R2 data cp inputs/namelist_NCEP-R2 inputs/namelist Memory and Performance Issues ------------------------------ "MemoryError" or system becomes unresponsive ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Problem**: Your dataset is too large for available RAM. **Immediate solutions**: 1. **Reduce spatial domain**: Pre-process your data to a smaller region: .. code-block:: python import xarray as xr ds = xr.open_dataset('large_file.nc') # Subset to smaller region ds_subset = ds.sel( latitude=slice(-50, -10), longitude=slice(-70, -30) ) ds_subset.to_netcdf('smaller_file.nc') 2. **Reduce temporal coverage**: Analyze shorter periods: .. code-block:: python ds_subset = ds.sel(time=slice('2020-01-01', '2020-01-07')) 3. **Reduce vertical levels**: Keep only tropospheric levels: .. code-block:: python ds_subset = ds.sel(level=slice(1000, 100)) # 1000 to 100 hPa 4. **Increase temporal resolution**: Use 6-hour instead of 3-hour data: .. code-block:: python ds_subset = ds.sel(time=ds.time.dt.hour.isin([0, 6, 12, 18])) 5. **Remove unnecessary variables**: .. code-block:: python # Keep only required variables required_vars = ['t', 'u', 'v', 'w', 'z', 'latitude', 'longitude', 'level', 'time'] ds_subset = ds[required_vars] **See also**: :doc:`usage` section on "System Requirements and Performance" Analysis takes too long ~~~~~~~~~~~~~~~~~~~~~~~~ **Problem**: Processing is slower than expected. **Optimization tips**: 1. **Use appropriate temporal resolution**: 6-hour data processes faster than 1-hour 2. **Reduce domain size**: Analyze only the region of interest 3. **Disable plots for initial runs**: Remove ``-p`` flag during testing 4. **Use fixed framework instead of interactive**: ``-f`` is much faster than ``-c`` 5. **Check your data preprocessing**: Pre-subset data before running the toolkit Configuration Issues -------------------- "No such file or directory: 'inputs/namelist'" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Problem**: You didn't create the namelist file. **Solution**: Copy one of the presets: .. code-block:: bash # For ERA5 data cp inputs/namelist_ERA5-cdsapi inputs/namelist # For NCEP Reanalysis 2 cp inputs/namelist_NCEP-R2 inputs/namelist # For NCEP Reanalysis 1 cp inputs/namelist_NCEP-R1 inputs/namelist **See also**: :doc:`configuration` for detailed instructions. "No such file or directory: 'inputs/box_limits'" (fixed framework) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Problem**: Using ``-f`` flag but haven't created box_limits file. **Solution**: Create the file with your domain bounds: .. code-block:: bash cat > inputs/box_limits << EOF min_lon;-60 max_lon;-30 min_lat;-45 max_lat;-20 EOF "No such file or directory: 'inputs/track'" (moving framework) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Problem**: Using ``-t`` flag but haven't specified a track file. **Solution**: Either: 1. **Create a track file** at ``inputs/track`` 2. **Use the --trackfile flag** to specify a different path: .. code-block:: bash python lorenzcycletoolkit.py data.nc -r -t --trackfile path/to/your/track.csv CDS API Issues -------------- CDS API authentication fails ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Error**: Connection or authentication errors when using ``--cdsapi``. **Solution checklist**: 1. **Check credentials file exists**: .. code-block:: bash cat ~/.cdsapirc 2. **Verify format**: .. code-block:: text url: https://cds.climate.copernicus.eu/api key: YOUR-UID:YOUR-API-KEY 3. **Verify credentials are correct**: Log in to https://cds.climate.copernicus.eu/user and check your UID and API key 4. **Check permissions**: .. code-block:: bash chmod 600 ~/.cdsapirc 5. **Accept Terms and Conditions**: You must accept the Copernicus terms at https://cds.climate.copernicus.eu/ before API access works CDS API download is very slow ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Problem**: Downloads take hours or timeout. **Causes and solutions**: 1. **CDS servers are busy**: Try during off-peak hours (evenings/weekends in Europe) 2. **Domain is too large**: Reduce spatial coverage or temporal resolution 3. **Requesting too many levels**: Consider requesting only tropospheric levels 4. **Use coarser temporal resolution**: .. code-block:: bash # Use 6-hour instead of 3-hour python lorenzcycletoolkit.py file.nc -t -r --cdsapi --time-resolution 6 Still Having Issues? -------------------- If your problem isn't covered here: 1. **Enable verbose logging** to see detailed information: .. code-block:: bash python lorenzcycletoolkit.py your_file.nc -r -f -v 2. **Check the log file** in ``LEC_Results/*/log.*`` for detailed error messages 3. **Review the documentation**: - :doc:`configuration` - Setup requirements - :doc:`usage` - Command-line options and data requirements - :doc:`examples_and_tutorials` - Step-by-step examples 4. **Seek support**: - GitHub Issues: https://github.com/daniloceano/LorenzCycleToolkit/issues - Email: danilo.oceano@gmail.com When reporting issues, please include: - Full error message and traceback - Your command line - Relevant parts of the log file - Description of your input data (source, resolution, coverage) - LorenzCycleToolkit version: check with ``git describe --tags``