Commit 0397cac

Merge pull request #7 from VortexFDC/chapter_2_and_3
Publish chapter 2 & 3: read_txt & merge
2 parents 6bef97e + 4dcf117 commit 0397cac

10 files changed: +6513 -2844 lines changed

.gitignore

+1
@@ -160,3 +160,4 @@ cython_debug/
 # and can be added to the global gitignore or merged into this file. For a more nuclear
 # option (not recommended) you can uncomment the following to ignore the entire idea folder.
 #.idea/
+.vscode/extensions.json

README.md

+6
@@ -77,6 +77,12 @@ To execute the examples, you can run the following commands from the terminal or
 - [Chapter 1](notebooks/example_1_read_netcdf.ipynb)
 Read netcdf files with the xarray library. You will open them and perform basic operations. A quick overview of the data is done using the pandas library.
 
+- [Chapter 2](notebooks/example_2_read_txt.ipynb)
+Read txt files with the pandas library and create custom utility functions, such as parsing the txt header metadata and incorporating it into the xarray data object.
+
+- [Chapter 3](notebooks/example_3_merge.ipynb)
+Merge two datasets. We will use the data from previous chapters and merge the synthetic data from the model with the measurements.
+
 
 ## 7. License [](#7-license)
 
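
The Chapter 3 entry added to the README describes merging the model data with the measurements, but the notebook itself is not part of this excerpt. The following is only a minimal sketch of what such a merge could look like with pandas, assuming both series are indexed by timestamp and share a column named `M`. The model values come from the sample rows quoted in `examples/example_2_read_txt.py`; the measurement values are made up for illustration, and the `_model` / `_obs` suffixes are hypothetical names, not the ones used in the repository.

```python
import pandas as pd

# Model wind speed: values taken from the sample rows in the script's comments.
model = pd.DataFrame(
    {"M": [7.5, 7.4]},
    index=pd.to_datetime(["2003-01-01 00:00", "2003-01-01 01:00"]),
)

# Measurements: made-up values, for illustration only.
obs = pd.DataFrame(
    {"M": [6.9, 7.2]},
    index=pd.to_datetime(["2003-01-01 01:00", "2003-01-01 02:00"]),
)

# Keep only the timestamps present in both series; the suffixes
# disambiguate the shared column name.
merged = model.join(obs, how="inner", lsuffix="_model", rsuffix="_obs")
print(merged)
```

An inner join is used here so that statistics computed afterwards compare the two sources over the same period; an outer join would keep every timestamp and fill the gaps with NaN.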

README_about_datasets.md

+1 -1
@@ -45,4 +45,4 @@ We are also using [Vortex f.d.c](http://www.vortexfdc.com) simulations. <br />
 <br /><br />
 
 
-<div align="center"><img src="images/logo_VORTEX.png" width="200px"> </center>
+<div align="center"><img src="images/logo_VORTEX.png" width="200px"> </center>

examples/example_2_read_txt.py

+198
@@ -0,0 +1,198 @@
+# =============================================================================
+# Authors: Oriol L & Arnau T
+# Company: Vortex F.d.C.
+# Year: 2024
+# =============================================================================
+
+"""
+Overview:
+---------
+This script demonstrates the process of reading various types of meteorological data files.
+The script uses functions to load and manipulate data from two distinct file formats:
+
+1. Vortex Text Series - Text file with multiple columns and a header.
+2. Vortex remodeling - txt: An LT extrapolation combining measurements and vortex time series.
+
+Data Storage:
+------------
+The acquired data is stored in the following data structure for comparison and analysis:
+- Pandas DataFrame
+
+Objective:
+----------
+- To understand the variance in data storage when using Pandas.
+- Use the 'describe', 'head' and other Pandas methods for a quick overview of the dataset.
+"""
+
+# =============================================================================
+# 1. Import Libraries
+# =============================================================================
+
+from typing import Dict
+from example_2_read_txt_functions import *
+from example_2_read_txt_functions import _get_coordinates_vortex_header
+
+# =============================================================================
+# 2. Define Paths and Site
+# =============================================================================
+
+SITE = 'froya'
+pwd = os.getcwd()
+base_path = str(os.path.join(pwd, '../data'))
+
+print()
+measurements_netcdf = os.path.join(base_path, f'{SITE}/measurements/obs.nc')
+vortex_netcdf = os.path.join(base_path, f'{SITE}/vortex/SERIE/vortex.serie.era5.utc0.nc')
+
+vortex_txt = os.path.join(base_path, f'{SITE}/vortex/SERIE/vortex.serie.era5.utc0.100m.txt')
+measurements_txt = os.path.join(base_path, f'{SITE}/measurements/obs.txt')
+
+# Print filenames
+print('Measurements txt: ', measurements_txt)
+print('Vortex txt: ', vortex_txt)
+
+print()
+print('#'*26, 'Vortex f.d.c. 2024', '#'*26)
+print()
+
+# =============================================================================
+# 3. Read Vortex Text Series Functions
+# =============================================================================
+
+# Read Text Series
+
+# Call read_txt_to_pandas with particular options for file vortex_txt
+# `vortex_txt` format is like this:
+
+# Lat=52.16632 Lon=14.12259 Hub-Height=100 Timezone=00.0 ASL-Height(avg. 3km-grid)=68 (file requested on 2023-09-28 10:30:31)
+# VORTEX (www.vortexfdc.com) - Computed at 3km resolution based on ERA5 data (designed for correlation purposes)
+#
+# YYYYMMDD HHMM M(m/s) D(deg) T(C) De(k/m3) PRE(hPa) RiNumber RH(%) RMOL(1/m)
+# 20030101 0000 7.5 133 -9.2 1.32 1000.2 0.26 80.3 0.0081
+# 20030101 0100 7.4 136 -10.0 1.32 999.8 0.25 82.1 0.0059
+
+def read_vortex_serie(filename: str = "vortex.txt",
+                      vars_new_names: Dict = None) -> xr.Dataset:
+    """
+    Read a typical Vortex time series from the SERIES product and return
+    an xarray.Dataset
+
+    Parameters
+    ----------
+    vars_new_names: Dict
+        dictionary mapping the old variable names to the new names
+
+    filename: str
+        name of the txt file to read
+
+    Returns
+    -------
+    ds: xarray.Dataset
+        Dataset
+
+    Examples
+    --------
+    Lat=52.90466 Lon=14.76794 Hub-Height=130 Timezone=00.0 ASL-Height(avg. 3km-grid)=73 (file requested on 2022-10-17 11:34:05)
+    VORTEX (www.vortex.es) - Computed at 3km resolution based on ERA5 data (designed for correlation purposes)
+    YYYYMMDD HHMM M(m/s) D(deg) T(C) De(k/m3) PRE(hPa) RiNumber RH(%) RMOL(1/m)
+    19910101 0000 8.5 175 2.1 1.25 988.1 0.56 91.1 0.
+
+    """
+    patterns = {'Lat=': 'lat',
+                'Lon=': 'lon',
+                'Timezone=': 'utc',
+                'Hub-Height=': 'lev'}
+    metadata = _get_coordinates_vortex_header(filename, patterns, line=0)
+    data = read_txt_to_pandas(filename, utc=metadata['utc'],
+                              skiprows=3, header=0, names=None)
+    __ds = convert_to_xarray(data, coords=metadata).squeeze()
+
+    if vars_new_names is None:
+        vars_new_names = {'M(m/s)': 'M',
+                          'D(deg)': 'Dir',
+                          'T(C)': 'T',
+                          'De(k/m3)': 'D',
+                          'PRE(hPa)': 'P',
+                          'RiNumber': 'RI',
+                          'RH(%)': 'RH',
+                          'RMOL(1/m)': 'RMOL'}
+    __ds = rename_vars(__ds, vars_new_names)
+
+    __ds = add_attrs_vars(__ds)
+    return __ds
+
+ds_vortex = read_vortex_serie(vortex_txt)
+print(ds_vortex)
+print()
+
+df_vortex = ds_vortex.to_dataframe()  # convert to dataframe
+
+# Quickly inspect with head() and describe() methods
+
+print('Vortex SERIES:\n', df_vortex[['M', 'Dir']].head())
+print()
+
+# =============================================================================
+# 4. Read Measurements Txt
+# =============================================================================
+
+def read_vortex_obs_to_dataframe(infile: str,
+                                 with_sd: bool = False,
+                                 out_dir_name: str = 'Dir',
+                                 **kwargs) -> pd.DataFrame:
+    """
+    Read a txt file with flexible options as a pandas DataFrame.
+
+    Parameters
+    ----------
+    infile: str
+        txt file. By default, no header, columns YYYYMMDD HHMM M D
+
+    with_sd: bool
+        If True, an 'SD' column is appended
+    out_dir_name: str
+        Label of the wind direction column in the returned dataframe
+
+    Returns
+    -------
+    df: pd.DataFrame
+        Dataframe
+
+    Examples
+    --------
+    >>> print("The default files read by this function are YYYYMMDD HHMM M D:")
+    20050619 0000 6.2 331 1.1
+    20050619 0010 6.8 347 0.9
+    20050619 0020 7.3 343 1.2
+
+    """
+
+    columns = ['YYYYMMDD', 'HHMM', 'M', out_dir_name]
+
+    if with_sd:
+        columns.append('SD')
+
+    readcsv_kwargs = {
+        'skiprows': 0,
+        'header': None,
+        'names': columns,
+    }
+    readcsv_kwargs.update(kwargs)
+
+    df: pd.DataFrame = read_txt_to_pandas(infile, **readcsv_kwargs)
+    return df
+
+df_obs = read_vortex_obs_to_dataframe(measurements_txt)
+ds_obs = convert_to_xarray(df_obs)
+
+print('Measurements:\n', df_obs.head())
+print()
+
+# =============================================================================
+# 5. Now we can compare statistics
+# =============================================================================
+
+print('Vortex SERIES Statistics:\n', df_vortex[['M', 'Dir']].describe().round(2))
+print()
+print('Measurements Statistics:\n', df_obs.describe().round(2))
+print()
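
The script relies on `_get_coordinates_vortex_header` and `read_txt_to_pandas`, both imported from `example_2_read_txt_functions`, which is not shown in this part of the diff. The sketch below is not that implementation. It is a hypothetical, standard-library-only stand-in that illustrates the two ideas the script depends on: pulling `Key=value` pairs out of the first header line using the same `patterns` dictionary, and combining the `YYYYMMDD` and `HHMM` columns into a single timestamp. The function names `parse_header` and `parse_timestamp` are assumptions made for this example.

```python
import re
from datetime import datetime
from typing import Dict

# Sample header line copied from the comment in example_2_read_txt.py.
HEADER = ("Lat=52.16632 Lon=14.12259 Hub-Height=100 Timezone=00.0 "
          "ASL-Height(avg. 3km-grid)=68 (file requested on 2023-09-28 10:30:31)")


def parse_header(header: str, patterns: Dict[str, str]) -> Dict[str, float]:
    """Hypothetical stand-in for _get_coordinates_vortex_header.

    For each 'Key=' pattern, read the number that follows it in the
    header and store it under the mapped name.
    """
    metadata = {}
    for key, name in patterns.items():
        match = re.search(re.escape(key) + r"(-?\d+(?:\.\d+)?)", header)
        if match:
            metadata[name] = float(match.group(1))
    return metadata


def parse_timestamp(yyyymmdd: str, hhmm: str) -> datetime:
    """Combine the YYYYMMDD and HHMM columns into one datetime."""
    return datetime.strptime(yyyymmdd + hhmm, "%Y%m%d%H%M")


# Same mapping as the `patterns` dictionary in read_vortex_serie.
patterns = {'Lat=': 'lat', 'Lon=': 'lon', 'Timezone=': 'utc', 'Hub-Height=': 'lev'}
metadata = parse_header(HEADER, patterns)
stamp = parse_timestamp("20030101", "0100")

print(metadata)
print(stamp)
```

In the script, the parsed `utc` value is passed on to `read_txt_to_pandas` and the whole dictionary becomes the coordinates of the xarray Dataset, which is why the header is parsed before the data rows are read.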
