-
Notifications
You must be signed in to change notification settings - Fork 328
Loop Computations
A loop computation computes members of a set according to a specific order. During the loop, we can perform a series of computations from simple ones, like accessing the current member and assigning values to a member, to complex ones, such as the inter-row computation, nested loop, and iterative computation. This essay introduces esProc, a convenient and fast tool of handling loop computations, through a set of sample programs. Looking ${article} for details.
Perform judgements by loop, adding a new member to an existing sequence each time.
【Example 1】Compare two files that have same number of rows to count the rows with same data. Below is part of the source data:
ID | Predicted_Y | Original_Y |
---|---|---|
10 | 0.012388464367608093 | 0.0 |
11 | 0.01519899123978988 | 0.0 |
13 | 0.0007920238885061248 | 0.0 |
19 | 0.0012656367468159102 | 0.0 |
21 | 0.009460545997473379 | 0.0 |
23 | 0.024176791871681664 | 0.0 |
… | … | … |
【SPL script】
A | B | C | |
---|---|---|---|
1 | =file("p_old.csv").import@ct() | / Read in the file first imported | |
2 | =file("p_new.csv").import@ct() | / Read in the second file imported | |
3 | for A1.len() | =cmp(A1(A3),A2(A3)) | / Compare the two files row by row by loop |
4 | =@ | B3 | |
5 | =B4.count(~==0) | / Count the rows having same data |
Below is A5’s result:
Value |
---|
11302 |
Loop through members of a set to compute each and assign value to it.
【Example 2】According to the following sales table, give a reward of sales amount’s 5% to each salesperson whose performance in 2014 rank the top 10%. Below is part of the source table:
OrderID | Customer | SellerId | OrderDate | Amount |
---|---|---|---|---|
10400 | EASTC | 1 | 2014/01/01 | 3063.0 |
10401 | HANAR | 1 | 2014/01/01 | 3868.6 |
10402 | ERNSH | 8 | 2014/01/02 | 2713.5 |
10403 | ERNSH | 4 | 2014/01/03 | 1005.9 |
10404 | MAGAA | 2 | 2014/01/03 | 1675.0 |
… | … | … | … | … |
【SPL script】
A | B | |
---|---|---|
1 | =connect("db").query("select * from sales") | / Connect to the data source to read in sales table |
2 | =A1.select(year(OrderDate)==2014) | / Get data of 2014 |
3 | =A2.groups(SellerId;sum(Amount):Amount) | / Group A2 by seller and calculate the total sales amount in 2014 |
4 | =A3.sort@z(Amount).to(A3.len()*0.1) | / Sort A3 by sales amount in descending order and get records where the amount ranks top 10% |
5 | =A4.run(Amount*=1.05) | / Use A.run() function to loop through A4’s selected records and give a reward of the sale’s amount’s 5% to each seller |
Below is A5’ s result:
SellerId | Amount |
---|---|
4 | 150433.185 |
3 | 127878.04 |
1 | 102756.759 |
8 | 87965.346 |
After data is grouped, summarize specific columns to get different aggregates while performing inter-row computations.
【Example 3】According to the user payment details table below, calculate the monthly sum payable in 2014 for each user. Below is part of the source table:
ID | customID | name | amount_payable | due_date | amount_paid | pay_date |
---|---|---|---|---|---|---|
112101 | C013 | CA | 12800 | 2014-02-21 | 12800 | 2014-12-19 |
112102 | C013 | CA | 3500 | 2014-06-15 | 3500 | 2014-12-15 |
112103 | C013 | CA | 2600 | 2015-03-21 | 6900 | 2015-10-17 |
The target is to output the payable sum per month in the year of 2014. If no data exists for the current month, the payable sum will be the one in the previous month:
name | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
CA | 12800 | 12800 | 12800 | 12800 | 16300 | 16300 | 16300 | 16300 | 16300 | 16300 | 16300 | |
… |
A | B | C | |
---|---|---|---|
1 | =file("Payment.txt").import@t().select(year(due_date)==2014) | / Import data of 2014 from the file | |
2 | =create(name,${12.().concat@c()}) | =A1.group(customID) | /A2: Generate an empty table of 12 months. A3: Group A1 by customer ID |
3 | for B2 | =12.(null) | / Generate null values for the 12 months |
4 | >A3.run(B3(month(due_date))= amount_payable) | / Set the payable sum for the corresponding month | |
5 | >B3.run(~+=~[-1]) | / Set null as payable sum of the previous month and calculate the cumulative sum month by month | |
6 | =A2.record(B2.name | B3) |
【SPL script】
Below is A2’s result:
name | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
CA | 12800 | 12800 | 12800 | 12800 | 16300 | 16300 | 16300 | 16300 | 16300 | 16300 | 16300 | |
… |
Count the frequency of continuous rising of a specific column value during the loop.
【Example 4】According to the SSE Composite Index records below, get the closing price’s largest number of consecutive rising days in 2019. Below is the part of the source table:
Date | Open | Close | Amount |
---|---|---|---|
2019/12/31 | 3036.3858 | 3050.124 | 2.27E11 |
2019/12/30 | 2998.1689 | 3040.0239 | 2.67E11 |
2019/12/27 | 3006.8517 | 3005.0355 | 2.58E11 |
2019/12/26 | 2981.2485 | 3007.3546 | 1.96E11 |
2019/12/25 | 2980.4276 | 2981.8805 | 1.9E11 |
… | … | … | … |
【SPL script】
A | B | |
---|---|---|
1 | =file("000001.csv").import@ct() | / Import the source data file |
2 | =A1.select(year(Date)==2019).sort(Date) | / Get records of 2019 and sort them by date |
3 | =n=0,A2.max(if(Close>Close[-1],n+=1,n=0)) | / Loop through closing prices to compare each one with the previous one and add 1 if the current price is higher, and then get the largest count |
Below is A3’s result:
Value |
---|
6 |
Use loop functions in a nested way.
【Example 5】The hundred fowls problem: If a rooster is worth five coins, a hen three coins, and three chicks together are worth one coin, how many roosters, hens, and chicks totaling 100 can be bought for 100 coins?
【SPL script】
A | B | |
---|---|---|
1 | =to(100/5) | / Number of roosters that can be bought |
2 | =to(100/3) | / Number of hens that can be bought |
3 | =33.(~*3) | / Number of chicks that can be bought |
4 | =create(Cock,Hen,Chick) | / Create an empty table to store the numbers of roosters, hens, and chicks |
5 | >A1.run(A2.run(A3.run(if(A1.~+A2.~+A3.~==100 "" A1.~*5+A2.~*3+A3.~/3==100,A4.insert(0,A1.~,A2.~,A3.~))))) | / Loop through the number of roosters, hens, and chicks respectively and insert the result records to A4’s table if the specific requirement is satisfied. The tilde sign is used to represent the current member |
Below is A4’s result:
Cock | Hen | Chick |
---|---|---|
4 | 18 | 78 |
8 | 11 | 81 |
12 | 4 | 84 |
Query a certain file by loop to generate desired information during which the loop number is displayed.
【Example 6】Query file 2 to output specific information according to key words in file 1.
【SPL script】
A | B | |
---|---|---|
1 | =file("file1.txt").read@n() | / Read in file 1 |
2 | =file("file2.txt").read@n() | / Read in file 2 |
3 | =A1.conj(("Q"+string(#)+"."+~) | A2.select(pos(~,A1.~)).(~.words()(1))) |
Below is A3’s result:
Member |
---|
Q1. like parks |
I |
Shelly |
Harry |
Q2. went out |
Shelly |
Q3. go out |
I |
Ben |
Harry |
Calculate the average within skewness intervals starting from specific positions by loop.
【Example 7】According to the stock exchange table below, list the average closing price within 20 days for each transaction date during 1.1 – 1.10 in the year of 2020. Below is part of the source table:
Date | Open | Close | Amount |
---|---|---|---|
2019/12/31 | 3036.3858 | 3050.124 | 2.27E11 |
2019/12/30 | 2998.1689 | 3040.0239 | 2.67E11 |
2019/12/27 | 3006.8517 | 3005.0355 | 2.58E11 |
2019/12/26 | 2981.2485 | 3007.3546 | 1.96E11 |
2019/12/25 | 2980.4276 | 2981.8805 | 1.9E11 |
… | … | … | … |
【SPL script】
A | B | |
---|---|---|
1 | =connect("db") | /Connect to the data source |
2 | =A1.query("select Date, Close from Stock where Code='600036'order by Date") | / Select records of the targeted stock and sort them by date |
3 | =A2.pselect@a(Date>=date("2020/01/01") "" Date<=date("2020/01/10")) | / Use A.pselect() function to get the sequence numbers of records from 1.1-1.10 in 2020 |
4 | =A2(A3).derive(A2.calc(A3(#),avg(Close[-19:0])):ma20) | / Use calc() function to calculate and return the average closing price within 20 days for each of the specified ten days. Close[-19:0] gets closing prices from 19 days ago to the current date |
Below is A4’s result:
Date | Close | ma20 |
---|---|---|
2020/01/02 | 38.88 | 37.35 |
2020/01/03 | 39.4 | 37.50 |
2020/01/06 | 39.24 | 37.64 |
2020/01/07 | 39.15 | 37.79 |
2020/01/08 | 38.41 | 37.90 |
2020/01/09 | 38.9 | 38.03 |
2020/01/10 | 39.04 | 38.16 |
Perform iterative accumulation during the loop and then filtering according to the cumulative total.
【Example 8】According to the sales table below, count the days needed to bag 20 orders per month in the year of 2014. Below is part of the source table:
OrderID | Customer | SellerId | OrderDate | Amount |
---|---|---|---|---|
10400 | EASTC | 1 | 2014/01/01 | 3063.0 |
10401 | HANAR | 1 | 2014/01/01 | 3868.6 |
10402 | ERNSH | 8 | 2014/01/02 | 2713.5 |
10403 | ERNSH | 4 | 2014/01/03 | 1005.9 |
10404 | MAGAA | 2 | 2014/01/03 | 1675.0 |
… | … | … | … | … |
【SPL script】
A | B | |
---|---|---|
1 | =connect("db").query("select * from sales") | / Connect to the data source to read in the sales table |
2 | =A1.select(year(OrderDate)==2014) | / Get records of 2014 |
3 | =A2.sort(OrderDate) | / Sort the selected records by order date |
4 | =A3.select(seq(month(OrderDate))==20) | / Use seq() function to get sequence numbers of orders in each month, and then select the record whose sequence number is 20 |
Below is A4’s result:
Month | Day |
---|---|
1 | 20 |
2 | 20 |
3 | 20 |
4 | 18 |
… | … |
Get the rank in each group by loop.
【Example 9】According to the employee table below, get the rank of each employee’s salary in their department. Below is part of the source table:
ID | NAME | DEPT | SALARY |
---|---|---|---|
1 | Rebecca | R"D | 7000 |
2 | Ashley | Finance | 11000 |
3 | Rachel | Sales | 9000 |
4 | Emily | HR | 7000 |
5 | Ashley | R"D | 16000 |
… | … | … | … |
【SPL script】
A | B | |
---|---|---|
1 | =connect("db") .query("select * from Employee order by DEPT, SALARY DESC") | / Connect to the data source, read in the employee table, and sort it by department and salary |
2 | =A1.derive(rank(SALARY;DEPT):DeptRank) | / Use rank() function to number the ordered departments and salaries, and calculate department ranks |
Below is A2’s result:
ID | NAME | DEPT | SALARY | DeptRank |
---|---|---|---|---|
2 | Ashley | Finance | 11000 | 1 |
32 | Andrew | Finance | 11000 | 1 |
230 | Hannah | Finance | 10000 | 3 |
24 | Chloe | Finance | 10000 | 3 |
… | … | … | … | … |
Get the tight ranks for specific fields in each group by loop.
【Example 10】According to the scores table below, get the ranks of all subject scores for the student whose ID is 8 in class one. Below is part of the source table:
CLASS | STUDENTID | SUBJECT | SCORE |
---|---|---|---|
Class one | 1 | English | 84 |
Class one | 1 | Math | 77 |
Class one | 1 | PE | 69 |
Class one | 2 | English | 81 |
Class one | 2 | Math | 80 |
… | … | … | … |
【SPL script】
A | B | |
---|---|---|
1 | =connect("db") .query("select * from SCORES where CLASS='Class one'order by SUBJECT, SCORE DESC") | / Connect to the data source, read in the scores table, and sort it by subject and score |
2 | =A1.derive(ranki(SCORE;SUBJECT):Rank) | / Use ranki() function to number the ordered departments and salaries, and calculate the tight rank for each score |
3 | =A2.select(STUDENTID==8) | / Get information of student whose ID is 8 |
4 | =create(${A3.(SUBJECT).concat@c()}).record(A3.(Rank)) | / Join up tight ranks of scores of all subjects according to A3’ information |
Below is A4’s result:
English | Math | PE |
---|---|---|
10 | 4 | 14 |
Calculate iterative sums by loop.
【Example 11】According to the SSE Composite Index records below, calculate the cumulative amount total for each transaction date in 2019. Below is part of the source table:
Date | Open | Close | Amount |
---|---|---|---|
2019/12/31 | 3036.3858 | 3050.124 | 2.27E11 |
2019/12/30 | 2998.1689 | 3040.0239 | 2.67E11 |
2019/12/27 | 3006.8517 | 3005.0355 | 2.58E11 |
2019/12/26 | 2981.2485 | 3007.3546 | 1.96E11 |
2019/12/25 | 2980.4276 | 2981.8805 | 1.9E11 |
… | … | … | … |
【SPL script】
A | B | |
---|---|---|
1 | =file("000001.csv").import@ct() | / Import the source file |
2 | =A1.select(year(Date)==2019).sort(Date) | / Select records of 2019 and sort them by date |
3 | =A2.derive(cum(Amount):CUM) | / Use cum() function to calculate the cumulative amount total |
Below is A3’s result:
Date | Open | Close | Amount | CUM |
---|---|---|---|---|
2019/01/02 | 2497.8805 | 2465.291 | 9.759E10 | 9.759E10 |
2019/01/03 | 2461.7829 | 2464.3628 | 1.07E11 | 2.046E11 |
2019/01/04 | 2446.0193 | 2514.8682 | 1.39E11 | 3.436E11 |
2019/01/07 | 2528.6987 | 2533.0887 | 1.46E11 | 4.896E11 |
2019/01/08 | 2530.3001 | 2526.4622 | 1.23E11 | 6.126E11 |
… | … | … | … | … |
Users define the computing expression and the desired termination condition for the iteration computation performed within the loop.
【Example 12】According to the sales table below, get the date when the sales target of 150,000 is achieved for the first quarter. Below is part of the source table:
OrderID | Customer | SellerId | OrderDate | Amount |
---|---|---|---|---|
10400 | EASTC | 1 | 2014/01/01 | 3063.0 |
10401 | HANAR | 1 | 2014/01/01 | 3868.6 |
10402 | ERNSH | 8 | 2014/01/02 | 2713.5 |
10403 | ERNSH | 4 | 2014/01/03 | 1005.9 |
10404 | MAGAA | 2 | 2014/01/03 | 1675.0 |
… | … | … | … | … |
【SPL script】
A | B | |
---|---|---|
1 | =connect("db").query("select * from sales") | / Connect to the data source and read in the sales table |
2 | =A1.select(year(OrderDate)==2014) | / Select data of 2014 |
3 | =A2.iterate((@+=Amount, ~~=OrderDate),0,@>150000) | / iterate() function performs the iterative computation with the initial value as 0 and returns the desired target date. Sum sales amounts to the current cell and until the total reaches 150,000. |
Below is A3’s result:
Value |
---|
2014/03/25 |
There are more examples in SPL CookBook.
SPL Resource: SPL Official Website | SPL Blog | Download esProc SPL | SPL Source Code