diff --git a/GraphQL_intro.md b/GraphQL_intro.md
new file mode 100644
index 00000000..52f91c60
--- /dev/null
+++ b/GraphQL_intro.md
@@ -0,0 +1,185 @@
+---
+layout: default
+title: GraphQL & API
+parent: Usage
+nav_order: 2
+permalink: /usage/graphql
+---
+# GraphQL & API
+{: .no_toc }
+
+## Table of contents
+{: .no_toc .text-delta }
+
+1. TOC
+{:toc}
+
+---
+# Introduction to GraphQL and querying the API
+
+GraphQL is a query language for Application Programming Interfaces (APIs). It documents what data is available in the API and lets you request exactly the data you want and nothing more.
+
+This tutorial provides a short introduction to GraphQL, but we recommend exploring the [GraphQL documentation](https://graphql.org/learn/) and other [introductory resources like this one](https://docs.github.com/en/graphql/guides/introduction-to-graphql) to learn more.
+
+Queries to the API are written in the GraphQL language, and the result (the data) is returned in [JSON](https://www.w3schools.com/whatis/whatis_json.asp) format. JSON (JavaScript Object Notation) is a standard text-based format for representing structured data. It is widely used for transmitting data in web applications, and it can easily be reformatted into tables or data frames within programming languages like R or Python.
+
+Zendro provides a GraphQL API web interface, called Graph**i**QL, which is a browser-based tool for writing, validating, and testing GraphQL queries.
+
+You can try a live example here; this is the API that we will be using in this and other tutorials: [https://zendro.conabio.gob.mx/api/graphql](https://zendro.conabio.gob.mx/api/graphql).
+
+Zendro's GraphQL API lets you not only query the data, but also create, modify or delete records (`mutate`). This is available only with authentication (i.e. logging in with edit permissions), and it won't be covered in this tutorial, but you can check Zendro's other how-to guides for details on mutations.
+
+## GraphiQL web interface
+
+The GraphiQL API web interface has the following main components:
+
+* A **left panel** where you can write your query in GraphQL format.
+* A **right panel** where the result of the query is provided in JSON format.
+* A **play button** to execute the query. Its keyboard shortcut is `Ctrl+E`.
+* A **Documentation Explorer** side menu, which you can show or hide by clicking on "Docs" in the top right corner.
+
+
+
+
+Data in GraphQL is organised in **types** and **fields** within those types. When thinking about your structured data, you can think of **types as the names of tables**, and **fields as the columns of those tables**. The records will be the rows of data from those tables. You can learn more in the [GraphQL documentation](https://graphql.org/learn/).
+
+A GraphQL service is created by defining types and fields on those types, and providing functions for each field on each type.
+
+The Documentation Explorer lets you examine which operations (e.g. query, mutation) are allowed for each type. Clicking on `Query` opens another view with the details of the operations available to query the data. In this view, all types available in a given dataset are listed in alphabetical order, with the operations that can be done on them listed below.
+
+In the example of the image above, we can see that the first type is `cities`. Types can contain elements or arguments, which are specified inside parentheses `()`. Some of these may be required arguments (marked with `!`), such as `pagination`.
+
+You can extend the bottom panel ("Query variables") to provide dynamic inputs to your query. [Learn more](https://graphql.org/learn/queries/#variables).
+
+## Writing queries
+
+The [GraphQL documentation](https://graphql.org/learn/) includes plenty of resources to learn how to build queries and make the most of the power of GraphQL. Below we provide just a short summary, after which we recommend exploring the [GraphQL documentation](https://graphql.org/learn/) to learn more. Feel free to try your queries in the [Zendro Dummy API](https://zendro.conabio.gob.mx/dummy_api) that we set up for tests.
+
+**GraphQL syntax tips:**
+
+* Queries and other operations are written between curly braces `{}`.
+* Types can contain elements or arguments, which are specified inside parentheses `()`.
+* Use a colon `:` to set parameter arguments (e.g. `pagination:{limit:10, offset:0}`).
+* Use a hashtag `#` to include comments within a query, which are useful to document what you are doing.
+* A query should provide at least one type (e.g. `rivers`), at least one field (e.g. `name`), and any mandatory arguments the types have (marked with `!` in the Docs).
+* In Zendro `pagination` is a mandatory argument. It refers to the number of records (`limit`) the output returns, starting from a given `offset`. If you don't specify the offset, it defaults to `offset:0`.
+
+A simple query will look like this:
+
+```
+{rivers(pagination:{limit:10, offset:0}){
+ # fields you want from the "rivers" type go here
+ name
+ }
+}
+```
+
+Copy-pasting and executing the query above in GraphiQL looks like the following image. That is, we get the names of the first 10 rivers in the data:
+
+
+
+But how did we know that `name` is a field within `rivers`? There are two options:
+
+**Option 1: Check the Docs panel**
+
+Click on `Query`, then look for the type you want (in this example `rivers`), and then click on `[river]`. This will open the list of fields available for it, along with their documentation:
+
+
+
+
+**Option 2: autocomplete while you type**
+
+If you want to know what fields are available for the type `rivers` you can press `ctrl+space` within the curly braces `{}` after `rivers(pagination:{limit:10, offset:0})`. A menu will appear showing you all possible fields.
+
+
+
+In this example, we can get the fields `river_id`, `name` and `country_ids`. The rest of the list is related to `countries`, because `rivers` is associated with `countries`, and therefore we can build a more complex query with them.
+
+But first, let's build a query to give us back the fields `river_id`, `name`, `length` and `country_ids` from the type `rivers`, like this:
+
+```
+{rivers(pagination:{limit:10, offset:0}){
+ river_id
+ name
+ length
+ country_ids
+ }
+}
+```
+
+As a result of the query, for each of the first 10 rivers (10 because we set `limit:10`) in the data we will get its id, name, length, and the id of any country it is associated with:
+
+
+
+### Extracting data from different types (i.e. dealing with associations)
+
+GraphQL can get fields associated with a record in different types, allowing us to get only the variables and records we need from the entire dataset. For example, we can get the name and length of a river, but also the name and population of the countries it crosses.
+
+Extracting data from associated types depends on whether the association is *one to one* (a city belongs to one country) or *one to many* (a river can cross many countries).
+
+#### One to one
+
+When the association is *one to one* the associated data model will appear as just another field. For example, each `city` is associated with one `country`, therefore `country` is one of the fields available within `cities`.
+
+If you look at the Docs, you will notice that it is not just another field, but that you need to provide it with an input search.
+
+
+
+In this case we want to find the country each city is associated with, and we know that the field in common (i.e. the key) is `country_id`, so your search should look like:
+
+```
+{
+cities(pagination:{limit:10, offset:0}){
+ city_id
+ name
+ population
+ country(search:{field:country_id}){
+ name
+ population
+ }
+ }
+ }
+```
+
+
+#### One to many
+
+When the association is *one to many* there will be a `Connection` for each association the model has. For example, to see the countries a river is associated with we need to use `countriesConnection`:
+
+```
+{rivers(pagination:{limit:10, offset:0}){
+ river_id
+ name
+ length
+ country_ids
+ countriesConnection(pagination:{first:1}){
+ countries{
+ name
+ population}
+ }
+ }
+}
+```
+
+Remember to check the Docs for any mandatory argument. In this case `pagination` is mandatory. You can check what you are expected to write in its `paginationCursorInput` by clicking on it in the documentation. Also check the [pagination documentation](https://zendro-dev.github.io/api_root/graphql#pagination-argument) for details on how to use this argument.
+
+After you execute the query, you will get the same data we got for each river before, but also the data of the country (or countries, if that is the case) it is associated with.
+
+
+
+
+In the above examples all the arguments are inside the query string. But the arguments to fields can also be dynamic. For instance, an application might have a dropdown menu that lets the user select which city they are interested in, or a set of filters.
+
+To improve run time, GraphQL can factor dynamic values out of the query, and pass them as a separate dictionary. These values are called **variables**. Common variables include search, order and pagination. To work with variables you need to do three things:
+
+1. Replace the static value in the query with `$variableName`
+2. Declare `$variableName` as one of the variables accepted by the query
+3. Pass `variableName: value` in the separate, transport-specific (usually JSON) variables dictionary
+
+Check the [official documentation](https://graphql.org/learn/queries/#variables) for examples.
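As a rough illustration of the three steps above, the following Python sketch builds the JSON body a client might send. The helper name `build_payload` is hypothetical, and the sketch assumes that `limit` and `offset` are declared as `Int` in the schema; check the Docs panel for the actual types before relying on it.

```python
import json

# Steps 1 and 2: the static values are replaced with $limit and $offset,
# which are declared in the query signature. The Int types are an assumption.
QUERY = """
query ($limit: Int, $offset: Int) {
  rivers(pagination: {limit: $limit, offset: $offset}) {
    name
  }
}
"""


def build_payload(limit, offset=0):
    """Return the request body as a dict: the query plus its variables (step 3)."""
    return {"query": QUERY, "variables": {"limit": limit, "offset": offset}}


payload = build_payload(limit=10)
body = json.dumps(payload)  # JSON text ready to be sent in a POST request
```

In GraphiQL, the same `variables` dictionary would go in the "Query variables" panel instead of being built in code.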
+
+As you can see, you can write much more complex queries to get the data you want. Please explore the [GraphQL documentation](https://graphql.org/learn/) or many other resources out there to learn more. The above examples should get you going if you want to get data to perform analyses in R or Python.
+
+Before trying to download data from R, Python or any other programming language using the GraphQL API, we recommend writing the query in the GraphiQL web interface and making sure it returns the desired data, as in the right panel in the image above.
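Once a query works in GraphiQL, sending it from a script is mostly a matter of wrapping it in a JSON POST body. The following is a minimal Python sketch using only the standard library. `make_request` is a hypothetical helper, and the request is built but not sent, so the sketch runs without network access.

```python
import json
import urllib.request

API_URL = "https://zendro.conabio.gob.mx/api/graphql"  # the API used in this tutorial

QUERY = "{rivers(pagination:{limit:10, offset:0}){river_id name length}}"


def make_request(query, url=API_URL):
    """Build (but do not send) a POST request carrying the GraphQL query as JSON."""
    data = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = make_request(QUERY)
# To actually run the query (requires network access):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```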
+
+**Next step?** Check Zendro How to guides for tutorials on how to use the GraphQL API from R or Python to explore and analyse data stored in Zendro.
diff --git a/README.md b/README.md
index 35028905..bb839add 100644
--- a/README.md
+++ b/README.md
@@ -8,41 +8,74 @@ Zendro is a software tool to quickly create a data warehouse tailored to your sp
Zendro consists of two main components, backend and frontend. The backend component has its [base project](https://github.com/ScienceDb/graphql-server) and a [code generator](https://github.com/ScienceDb/graphql-server-model-codegen). The frontend of SPA (Single Page Application) also has its [base project](https://github.com/ScienceDb/single-page-app).
See the guides below on how to use Zendro.
-Also find Zendro-dev on [github](https://github.com/Zendro-dev).
+To see or contribute to our code please visit Zendro-dev on [github](https://github.com/Zendro-dev), where you can find the repositories for:
-If you have any questions or comments, please don't hesitate to contact us via an issue [here](https://github.com/Zendro-dev/Zendro-dev.github.io/issues). Tag your issue as a question and we will try to answer as quick as possible.
+* [GraphQL server](https://github.com/ScienceDb/graphql-server)
+* [GraphQL server model generator](https://github.com/ScienceDb/graphql-server-model-codegen)
+* [Single page application](https://github.com/ScienceDb/single-page-app)
+
+If you have any questions or comments, please don't hesitate to contact us via an issue [here](https://github.com/Zendro-dev/Zendro-dev.github.io/issues). Tag your issue as a question or bug and we will try to answer as quickly as possible.
+
+## SHOW ME HOW IT LOOKS!
+
+Would you like to see Zendro in action before deciding to learn more? That's fine! We set up a Dummy Zendro Instance for you to explore [Zendro's graphical user interface](https://zendro.conabio.gob.mx/spa) and [Zendro's API](https://zendro.conabio.gob.mx/graphiql). The tutorials on how to [use Zendro day to day](#using-zendro-day-to-day) in the section below use this instance, so go there to start exploring.
-[](https://github.com/Zendro-dev/Zendro-dev.github.io/blob/master/quickstart.md)
+### Installation and sysadmin
-[](https://github.com/Zendro-dev/Zendro-dev.github.io/blob/master/setup_root.md)
+To get started with Zendro, try the [Quickstart tutorial](https://zendro-dev.github.io/quickstart.html) on how to create a new Zendro project with pre-defined data models, database and environment variables. Then move on to the [Getting started tutorial](https://zendro-dev.github.io/setup_root.html), a step-by-step guide on how to create a new Zendro project from scratch, aimed at software developers and system administrators.
+
+[](quickstart.md)
+
+[](setup_root.md)
+
+For more sysadmin details also check:
### HOW-TO GUIDES:
-* [How to define data models: for developers](setup_data_scheme.md). Detailed technical specifications on how to define data models for Zendro, aimed at software developers and system administrators.
-* [How to define data models: for non-developers](non-developer_documentation.md). A brief, illustrated guide of data model specifications, data formatting and data uploading options, aimed at data modelers or managers to facilitate collaboration with developers.
-* [How to setup a distributed cloud of zendro nodes](ddm.md). A brief guide, aimed at software developers and system administrators, on how to use Zendros distributed-data-models.
* [How to use Zendro command line interface (CLI)](zendro_cli.md). A tutorial of Zendro CLI, aimed at software developers.
-* [How to query and extract data](fromGraphQlToR.html). A concise guide on how to use the Zendro API from R to extract data and perform queries, aimed at data managers or data scientists.
* [How to setup Authentication / Authorization](oauth.md). A concise guide on how to use and setup the Zendro authorization / authentication services.
-* [API documentation](api_root.md).
+* [How to set up a distributed cloud of Zendro nodes](ddm.md). A brief guide, aimed at software developers and system administrators, on how to use Zendro's distributed-data-models.
+* [API documentation](api_root.md). A summary of how the Zendro backend generator implements a CRUD API that can be accessed through the GraphQL query language.
-### REPOSITORIES:
+### Defining data models
-* [GraphQL server](https://github.com/ScienceDb/graphql-server)
-* [GraphQL server model generator](https://github.com/ScienceDb/graphql-server-model-codegen)
-* [Single page application](https://github.com/ScienceDb/single-page-app)
+* [How to define data models: for developers](setup_data_scheme.md). Detailed technical specifications on how to define data models for Zendro, aimed at software developers and system administrators.
+* [How to define data models: for non-developers](what_are_data_models.md). A brief, illustrated guide of data model specifications, data formatting and data uploading options, aimed at data modelers or managers to facilitate collaboration with developers.
+
+### Using Zendro day to day
+
+* [How to use Zendro's graphical interface](SPA_howto.md). A full guide on how to use Zendro's graphical point and click interface. Aimed at general users and featuring lots of screenshots.
+* [Introduction to GraphQL and querying the API](GraphQL_intro.md). A friendly intro to how to perform GraphQL queries and use GraphiQL documentation.
+* [How to query and extract data from R](fromGraphQlToR.html). A concise guide on how to use the Zendro API from R to extract data and perform queries, aimed at data managers or data scientists.
+* [How to use the Zendro API with Python to make changes to the data](Zendro_requests_with_python.md). A concise guide on how to access the API using your user credentials to perform CRUD operations on the data using Python.
+
+## Zendro user profiles
-### CONTRIBUTIONS
+We designed Zendro to be useful for research teams and institutions that include users with different areas of expertise, needs and types of activities. The table below summarizes how we envision that different users will use Zendro:
+
+
+| Profile | Background | Expected use |
+|-------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| General user / scientist |
+GraphQL outputs the results of a query in batches of at most 1,000
+elements. If the data you want to download is larger than that,
+you need to paginate, i.e. get the data in batches.
+`pagination` is an argument within GraphQL queries that
+can be specified in two ways:
+* Limit-offset: indicating the first element to get
+(`offset`, default 0) and the number of elements to get
+(`limit`). The `limit` can’t be larger than `1000`.
+* Cursor-based: indicating the unique ID
+(`cursor`) of the element to get first, and a number of
+elements to get after.
+Zendro uses limit-offset pagination with the syntax:
+`pagination:{limit:[integer], offset:[integer]}`
+See the GraphQL documentation and this tutorial on GraphQL pagination for more details.
+In the previous examples we downloaded only 10 elements
+(`pagination:{limit:10}`) from the `rivers` type, but the
+dataset is larger. (Remember, data in GraphQL is organised in
+types and fields within those types.
+When thinking about your structured data, you can think of types as the
+names of tables, and fields as the columns of those tables. In the
+example above `rivers` is a type and the fields are
+`river_id`, `name`, `length` among
+others.)
+To know how many elements a type has we can make a query with
+the function `count`, if it is available for the type we are
+interested in. You can check this in the `Docs` at the top
+right menu of the GraphiQL interface.
+For example, `rivers` has the function
+`countRivers`, so with the query `{countRivers}` we
+can get the total number of rivers.
+Similar to how we got data before, you can use this very simple query
+in the function `get_from_graphQL` to get the number of
+rivers into R:
+# query API with count function
+no_records<-get_from_graphQL(query="{countRivers}", url="https://zendro.conabio.gob.mx/api/graphql")
+
+# change to vector, we don't need a df
+no_records<-no_records[1,1]
+no_records
+## [1] 50
+In this case we have 50. Technically we could download all the data
+in a single batch because it is <1000, but for demonstration purposes
+we will download it in batches of 10.
+The following code calculates the number of pages needed to get a
+given number of records assuming a desired limit (size of each batch).
+Then it runs `get_from_graphQL()` within a loop for each page
+until getting the total number of records desired.
+# Define desired number of records and limit. Number of pages and offset will be estimated based on the number of records to download
+no_records<- no_records # this was estimated above with a query to count the total number of records, but you can also manually change it to a custom desired number
+my_limit<-10 # max 1000.
+no_pages<-ceiling(no_records/my_limit)
+
+## Define offset.
+# You can use the following loop
+# to calculate the offset automatically based
+# on the number of pages needed.
+my_offset<-0 # start at 0. Leave like this
+for(i in seq_len(max(no_pages-1, 0))){ # one offset per page; the first page already starts at 0
+ my_offset<-c(my_offset, my_limit*i)
+}
+
+# Or you can define the offset manually
+# uncommenting the following line
+# and commenting the loop above:
+# my_offset<-c(#manually define your vector)
+
+## create object where to store downloaded data. Leave empty
+data<-character()
+
+##
+## Loop to download the data from GraphQL using pagination
+##
+
+for(i in c(1:length(my_offset))){
+
+# Define pagination
+pagination <- paste0("limit:", my_limit, ", offset:", my_offset[i])
+
+# Define query looping through desired pagination:
+my_query<- paste0("{
+ rivers(pagination:{", pagination, "}){
+ river_id
+ name
+ length
+ }
+ }
+ ")
+
+
+
+# Get data and add it to the already created df
+data<-rbind(data, get_from_graphQL(query=my_query, url="https://zendro.conabio.gob.mx/api/graphql"))
+
+#end of loop
+}
+As a result you will get all the data in a single df:
+head(data)
+summary(data)
+## river_id name length
+## Length:50 Length:50 Min. : 65.0
+## Class :character Class :character 1st Qu.: 150.0
+## Mode :character Mode :character Median : 283.0
+## Mean : 347.1
+## 3rd Qu.: 402.5
+## Max. :1521.0
+## NA's :6
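The page arithmetic described above can also be sketched in Python, for readers who will query the API from that language. The function names `page_offsets` and `paginated_queries` are hypothetical. The sketch only builds the query strings and does not contact the API.

```python
import math


def page_offsets(n_records, limit):
    """Return the offset of each page needed to fetch n_records in batches of `limit`."""
    n_pages = math.ceil(n_records / limit)
    return [page * limit for page in range(n_pages)]


def paginated_queries(n_records, limit):
    """Build one rivers query string per page, using limit-offset pagination."""
    template = "{{rivers(pagination:{{limit:{limit}, offset:{offset}}}){{river_id name length}}}}"
    return [template.format(limit=limit, offset=o) for o in page_offsets(n_records, limit)]


offsets = page_offsets(50, 10)      # 50 records in batches of 10
queries = paginated_queries(50, 10)
```

Each string in `queries` can then be sent in turn and the results appended, as in the R loop.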
+## `get_from_graphQL()` explained step by step
+The following is a step-by-step example explaining in more detail
+how the function `get_from_graphQL()` that we used above
+works.
+First, once you have a GraphQL query working, you’ll need to save it
+to an R object as a character vector:
+my_query<- "{
+rivers(pagination:{limit:10, offset:0}){
+ river_id
+ name
+ length
+ }
}
-' # write the query as a string
-result <- POST(url, body = list(query=accessions_query)) # fetch data
-The `result` that we are getting is the `http` response. We still need to extract the data in order to be able to manipulate it. If everything went well, the `http` response will contain an attribute `data` which will itself contain an attribute named as the query, in this case `accessions`.
-jsonResult <- content(result, as = "text")
-readableText <- fromJSON(jsonResult)
-readableText$data$accessions
+"
+Next, define the URL of the API as another character vector. It is
+the same URL as that of the GraphiQL web interface you explored above:
+url<-"https://zendro.conabio.gob.mx/api/graphql"
+Now we can send a query to the API by using a POST request:
+# query server
+result <- POST(url, body = list(query=my_query), encode = c("json"))
+The result that we are getting is the `http` response.
+Before checking if we got the data, it is good practice to verify that the
+connection was successful by checking the status code. A
+`200` means that all went well. Any other code means
+problems. See this.
+# check server response
+result$status_code
+## [1] 200
+We now need to extract the data in order to be able to manipulate it.
+If everything went well, the `http` response will contain an
+attribute data which will itself contain an attribute named as the
+query, in this case `rivers`.
+result
+## Response [https://zendro.conabio.gob.mx/api/graphql]
+## Date: 2022-07-27 23:15
+## Status: 200
+## Content-Type: application/json; charset=utf-8
+## Size: 983 B
+## {
+## "data": {
+## "rivers": [
+## {
+## "river_id": "1",
+## "name": "Acaponeta",
+## "length": 233
+## },
+## {
+## "river_id": "10",
+## ...
+If the query is not written properly or if there is any other error,
+the attribute `data` won’t exist and instead we will get the
+attribute `errors` listing the errors found.
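The same check can be sketched in Python for readers working in that language. `extract_data` is a hypothetical helper, the sample response reuses the first record shown in the output above, and the error message text is made up for illustration.

```python
import json


def extract_data(response_text, type_name):
    """Return the records under data[type_name], or raise if the response reports errors."""
    parsed = json.loads(response_text)
    if "errors" in parsed:
        messages = "; ".join(e.get("message", "unknown error") for e in parsed["errors"])
        raise ValueError(f"GraphQL query failed: {messages}")
    return parsed["data"][type_name]


# Sample response built from the first record shown in the output above
ok_text = '{"data": {"rivers": [{"river_id": "1", "name": "Acaponeta", "length": 233}]}}'
rivers = extract_data(ok_text, "rivers")

# Illustrative error response; the message text is made up for this sketch
bad_text = '{"errors": [{"message": "example error"}]}'
```

Calling `extract_data(bad_text, "rivers")` raises a `ValueError` carrying the error messages, so a failed query is not mistaken for an empty result.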
+If all went well we can proceed to extract the content of the
+results with:
+# get data from query result
+jsonResult <- content(result, as = "text")
+The result will be in JSON format, which we can convert into an
+R object (list). In this list the results are within each type used in the
+query. The argument `flatten` is used to collapse the data
+from different types into a single data frame.
+# transform from json
+readableResult <- fromJSON(jsonResult,
+ flatten = T)
+Extract data:
+# get data
+data<-as.data.frame(readableResult$data[1])
+head(data)
With the above code we were able to visualize the data fetched from zendro server. Next we will put this data in a table form in order to be able to manipulate the data.
-dataTable <- lapply(readableText[[1]], as.data.table)
-dataTable$accessions
+By default, the name of each type will be added at the beginning of
+each column name:
+colnames(data)
+## [1] "rivers.river_id" "rivers.name" "rivers.length"
+To keep only the name of the variable as it is in the original
+data:
+x<-str_match(colnames(data), "\\w*$")[,1] # matches word characters (i.e. not the ".") at the end of the string
+colnames(data)<-x # assign new colnames
+So finally we have the data in a single nice looking data frame:
+head(data)
+Notice that you will get a data frame like the one above only for one
+to one associations, but that in other cases you will still get
+variables that are a list, which you can process in a separate step.
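To make the difference between the two cases concrete, here is a small Python sketch of the flattening idea. It is not the R function used above. `flatten_record` is a hypothetical helper and both sample records are made up. Nested one to one objects become dotted column names, while lists from one to many associations are left untouched for a later step.

```python
def flatten_record(record, prefix=""):
    """Flatten nested dicts into dotted keys; lists are kept as-is for a later step."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_record(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


# Made-up record shaped like a city with a one to one `country` association
city = {"city_id": "1", "name": "Example City", "country": {"name": "Example Country"}}
flat_city = flatten_record(city)

# Made-up record with a one to many association: the list is left untouched
river = {"river_id": "1", "countries": [{"name": "Example Country"}]}
flat_river = flatten_record(river)
```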