GSoC 2018: Second Coding phase
Hello there! Today I will go over the work done in the second coding phase. As you may know, the second coding phase of GSoC is over, and I am a student under Python Hydra working on the project “implement Redis as a datastore”. In this phase my main task was to implement a querying mechanism for the data that we fetch from the server and store in Redis as a graph using redisgraph. With this mechanism, a user can query the data easily and efficiently. During this period I learned a lot about Redis and about other Python and coding topics such as design patterns.
For the querying process, we first had to refactor the code from the first coding phase so that the graph is loaded only when it is required, that is, to use lazy loading of the graph. To do this, we initially load only a subgraph containing the entrypoint, the collection endpoints and the class endpoints. After that, the rest of the graph is loaded based on the queries made by the user (we will discuss it later).
Once lazy loading of the graph was in place, we had to settle on a query format. After some research, we defined the following format:
1. for all class endpoints >>> show classEndpoints
2. for all collection endpoints >>> show collectionEndpoints
3. for all endpoints (class + collection) >>> show endpoints
4. to access the members of any collection >>> show <endpoint type> members, Ex: show DroneCollection members
5. to access all properties of a specific member >>> show objects<endpoint type> properties, and similarly for operations.
6. for the properties of a member with their values >>> show objects<endpoint type> property_value
7. for all properties of any class >>> show class<endpoint type> properties, and similarly for operations.
8. for all properties of a class with their values >>> show class<endpoint type> property_value
9. for the properties of an object >>> show object<id of member> properties
10. for comparison of properties >>> show <property_key> <property_value> and/or.....
Ex: show model xyz and name Drone1
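To make the format above more concrete, here is a minimal sketch of how a client might sort an incoming query into one of these groups before answering it. This is not the project's actual parser: the function name classify_query and the category labels are hypothetical and chosen only for illustration, and only the properties and property_value keywords are handled.

```python
# Hypothetical helper: sorts a query string into the groups listed above.
INITIAL_GRAPH_QUERIES = {"classEndpoints", "collectionEndpoints", "endpoints"}
DETAIL_KEYWORDS = {"properties", "property_value"}


def classify_query(query):
    """Return a rough category label (made up for this sketch) for a query."""
    tokens = query.split()
    if len(tokens) < 2 or tokens[0] != "show":
        raise ValueError("a query must look like: show <something>")
    target = tokens[1]
    if len(tokens) == 2:
        if target in INITIAL_GRAPH_QUERIES:
            return "initial_graph"            # formats 1-3
        raise ValueError("unknown query: " + query)
    if len(tokens) == 3 and tokens[2] == "members":
        return "collection_members"           # format 4
    if len(tokens) == 3 and tokens[2] in DETAIL_KEYWORDS:
        # "objects" is tested before "object" because they share a prefix.
        for prefix, label in (("objects", "member_details"),
                              ("class", "class_details"),
                              ("object", "object_details")):
            if target.startswith(prefix) and len(target) > len(prefix):
                return label                  # formats 5-9
    return "property_comparison"              # format 10


print(classify_query("show DroneCollection members"))
print(classify_query("show model xyz and name Drone1"))
```

One limitation of this sketch is that an object id beginning with the letter "s" would be mistaken for an "objects" query, so a real parser would need a less ambiguous rule.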
Once the query format was defined, I had to implement it. If the user issues one of the first three queries in the format above, we can answer it directly from the initial graph stored in Redis. If the user issues a query of types 4–9, we have to load the data for the given endpoint from the server and store it in Redis as a graph, which happens only once per endpoint. Once the data is loaded, the client queries Redis directly for that endpoint. We use OpenCypher (a query language) for querying in redisgraph.
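The load-once behaviour can be sketched as follows. This is only an illustration under stated assumptions: a plain dictionary stands in for the graph stored in Redis, and the class LazyEndpointStore and the function fake_fetch are hypothetical names, not part of the actual client.

```python
class LazyEndpointStore:
    """Loads data for an endpoint on first use, then serves it from the store."""

    def __init__(self, fetch_from_server):
        self._fetch = fetch_from_server  # hypothetical callable: endpoint -> data
        self._store = {}                 # stand-in for the graph kept in Redis

    def get(self, endpoint):
        # Contact the server only if this endpoint has not been loaded yet.
        if endpoint not in self._store:
            self._store[endpoint] = self._fetch(endpoint)
        return self._store[endpoint]


calls = []  # records every simulated round trip to the server


def fake_fetch(endpoint):
    """Pretend server call that returns made-up data for the endpoint."""
    calls.append(endpoint)
    return {"endpoint": endpoint, "members": ["member-1", "member-2"]}


store = LazyEndpointStore(fake_fetch)
first = store.get("DroneCollection")   # triggers the fetch
second = store.get("DroneCollection")  # answered from the store
print(len(calls))
```

The second lookup does not call fake_fetch again, which mirrors the idea that the client queries Redis directly once the endpoint has been loaded.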
For the last type of query, the querying mechanism uses a special kind of index called a faceted index, which is built on properties and their values. For example, for the property model with the value xyz, the faceted index key is fs:model:xyz, and the value stored under that key is the “endpoint type” or “member id”. It can be populated like this: sadd fs:model:xyz <member_id>. Here sadd adds an element to a set (a Redis data structure).
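As a rough illustration of the faceted index, the sketch below uses Python sets in place of Redis sets so that it runs without a server. The helper names and the sample members are hypothetical. Mapping "and" to a set intersection (Redis SINTER) and "or" to a set union (Redis SUNION) is an assumption about how such an index could be queried, not a description of the actual client.

```python
from collections import defaultdict

# key -> set of member ids; stands in for Redis sets such as fs:model:xyz
index = defaultdict(set)


def facet_key(prop, value):
    """Build a key in the fs:<property>:<value> style described above."""
    return "fs:{}:{}".format(prop, value)


def add_member(member_id, properties):
    """Index a member under each of its property/value pairs."""
    for prop, value in properties.items():
        # Redis equivalent: SADD fs:<prop>:<value> <member_id>
        index[facet_key(prop, value)].add(member_id)


def members_matching_all(**criteria):
    """'and' query: members that have every given property value."""
    sets = [index.get(facet_key(p, v), set()) for p, v in criteria.items()]
    return set.intersection(*sets) if sets else set()


def members_matching_any(**criteria):
    """'or' query: members that have at least one given property value."""
    sets = [index.get(facet_key(p, v), set()) for p, v in criteria.items()]
    return set.union(*sets) if sets else set()


# Made-up sample data
add_member("drone-1", {"model": "xyz", "name": "Drone1"})
add_member("drone-2", {"model": "xyz", "name": "Drone2"})
add_member("drone-3", {"model": "abc", "name": "Drone1"})

# Roughly: show model xyz and name Drone1
print(members_matching_all(model="xyz", name="Drone1"))
```

With this sample data, only drone-1 has both model xyz and name Drone1, so it is the only member returned by the "and" query.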
The client now works fine. There are some feature issues that I am still working on. For more detail about the client and the querying mechanism, you should go through:
Merged PRs during this period:
- Update README.md and remove hydra-py
- Remove hydra-py.py(old) code
- Update hydra_graph.py
- update setup.py
- update readme
- Add LICENSE
- Implement querying mechanism
- Add uuid instead of id(pk)
Thanks!
Sandeep Chauhan