About the Author
Man, are you hungry? Senior development engineer. iOS, Go, Java are all involved. Currently focusing on big data development. Like riding and climbing.
Foreword: For the application scenarios of the location service field of "People nearby", it is common to use PG, MySQL, and MongoDB to implement spatial indexes. And Redis has another path, combined with its ordered queue zset and geohash coding, to achieve the spatial search function, and has extremely high operating efficiency. This article analyzes its algorithm principle from the perspective of source code, and estimates the query time complexity.
To provide a complete "People Nearby" service, the most basic thing is to implement the functions of "add", "delete" and "check". The following will be introduced separately, which will focus on the analysis of query functions.
Operation command
Since Redis 3.2, Redis has provided geographyrelated functions based on geohash and ordered collections . The Redis Geo module contains the following 6 commands:
 GEOADD : add the given location object (latitude, longitude, name) to the specified key;
 GEOPOS: returns the position (longitude and latitude) of all the given position objects from the key;
 GEODIST: returns the distance between two given locations;
 GEOHASH: returns a Geohash representation of one or more location objects;
 GEORADIUS : returns all position objects in the target set whose distance from the center does not exceed the given maximum distance with the given latitude and longitude as the center;
 GEORADIUSBYMEMBER: Returns all position objects with the given position object as the center, within a given distance.
Among them, the combined use of GEOADD and GEORADIUS can realize the basic functions of "increase" and "check" in "people nearby". To achieve the function of "People nearby" in WeChat, you can directly use the GEORADIUSBYMEMBER command. The "given location object" is the user himself, and the search object is other users. However, in essence, GEORADIUSBYMEMBER = GEOPOS + GEORADIUS, that is, first find the user's location, and then use this location to search for other user objects in the vicinity that meet the conditions of mutual distance between locations.
The following will analyze the GEOADD and GEORADIUS commands from the perspective of source code, and analyze the algorithm principle.
Redis geo operations only include "increase" and "check" operations, and there is no special "delete" command. Mainly because Redis uses an ordered set (zset) to store location objects, which can be deleted using zrem.
In the comments of the Redis source code geo.c, it is only stated that this file is an implementation file of GEOADD, GEORADIUS, and GEORADIUSBYMEMBER (in fact, the other three commands are also implemented). Seen from the side, the other three commands are auxiliary commands.
GEOADD
How to use
GEOADD key longitude latitude member [longitude latitude member ...] 复制代码
Adds the given position object (latitude, longitude, name) to the specified key.
Where key is the name of the collection and member is the object corresponding to the latitude and longitude. In practical applications, when there are too many objects to be stored, you can sharding the collection of objects in disguise by setting multiple keys (such as one province and one key) to avoid too many single collections.
Return value after successful insertion:
( integer ) N 复制代码
Where N is the number of successful insertions.
Source code analysis
/* GEOADD key long lat name [long2 lat2 name2 ... longN latN nameN] */ void geoaddCommand(client *c) { //参数校验/* Check arguments number for sanity. */ if ((c>argc  2) % 3 != 0) { /* Need an odd number of arguments if we got this far... */ addReplyError(c, "syntax error. Try GEOADD key [x1] [y1] [name1] " "[x2] [y2] [name2] ... " ); return ; } //参数提取Redis int elements = (c>argc  2) / 3; int argc = 2+elements*2; /* ZADD key score ele ... */ robj **argv = zcalloc(argc*sizeof(robj*)); argv[0] = createRawStringObject( "zadd" ,4); argv[1] = c>argv[1]; /* key */ incrRefCount(argv[1]); //参数遍历+转换/* Create the argument vector to call ZADD in order to add all * the score,value pairs to the requested zset, where score is actually * an encoded version of lat,long. */ int i; for (i = 0; i < elements; i++) { double xy[2]; //提取经纬度if (extractLongLatOrReply(c, (c>argv+2)+(i*3),xy) == C_ERR) { for (i = 0; i < argc; i++) if (argv[i]) decrRefCount(argv[i]); zfree(argv); return ; } //将经纬度转换为52位的geohash作为分值& 提取对象名称/* Turn the coordinates into the score of the element. */ GeoHashBits hash ; geohashEncodeWGS84(xy[0], xy[1], GEO_STEP_MAX, & hash ); GeoHashFix52Bits bits = geohashAlign52Bits( hash ); robj *score = createObject(OBJ_STRING, sdsfromlonglong(bits)); robj *val = c>argv[2 + i * 3 + 2]; //设置有序集合的对象元素名称和分值argv[2+i*2] = score; argv[3+i*2] = val; incrRefCount(val); } //调用zadd命令，存储转化好的对象/* Finally call ZADD that will do the work for us. */ replaceClientCommandVector(c,argc,argv); zaddCommand(c); } 复制代码
Through source code analysis, it can be seen that Redis uses an ordered set (zset) to store location objects. Each element in the ordered set is an object with a position. The score value of the element is a 52bit geohash value corresponding to its latitude and longitude.
Double type precision is 52 digits;
Geohash is encoded in base32. 52bits can store up to 10 geohash values, corresponding to a grid with a size of 0.6 * 0.6 meters. In other words, the position transformed by Redis geo would theoretically have an error of about 0.3 * 1.414 = 0.424 meters.
Algorithm summary
A brief summary of what the GEOADD command does:
1.Parameter extraction and verification;
2. Convert the input latitude and longitude to a 52bit geohash value (score);
3. Call the ZADD command to store member and its corresponding score in the set key.
GEORADIUS
How to use
GEORADIUS key longitude latitude radius mkmftmi [WITHCOORD] [WITHDIST] [WITHHASH] [ASCDESC] [COUNT count] [STORE key] [STORedisT key] 复制代码
With the given latitude and longitude as the center, return all position objects in the target set whose distance from the center does not exceed the given maximum distance.
Range unit: m  km  ft  mi> meters  kilometers  feet  miles Additional parameters:
WITHDIST: When the position object is returned, the distance between the position object and the center is also returned. The unit of distance is consistent with the range unit given by the user.
WITHCOORD: Returns the longitude and latitude of the location object as well.
WITHHASH: Returns the ordered set score of the location object as a raw geohash code in the form of a 52bit signed integer. This option is mainly used for lowlevel applications or debugging. It is not very useful in practice.
ASC  DESC: Returns position object elements from near to far  Returns position object elements from far to near. COUNT count: Select the first N matching position object elements. (If not set, all elements will be returned)STORE key: Save the returned geographic location information to the specified key. STORedisT key: Save the distance from the center point of the returned result to the specified key.
Due to the existence of the STORE and STORedisT options, the GEORADIUS and GEORADIUSBYMEMBER commands are technically marked as write commands, which will only query (write) the master instance. When the QPS is too high, it will easily cause the master instance to read and write too much pressure. To solve this problem, in Redis 3.2.10 and Redis 4.0.0, two readonly commands, GEORADIUS_RO and GEORADIUSBYMEMBER_RO, were added respectively.
However, in actual development, the author found that the GeoRadiusParam parameter class of the java packageRedis.clients.jedis.params.geo
does not include the STORE and STORedisT parameter options. Did you really only query the main instance when calling georadius? Readonly packaged. Interested friends can study on their own.
Return value after successful query:
Without WITH qualification, return a member list, such as:
[ "member1" , "member2" , "member3" ] 复制代码
With WITH qualification, each member in the member list is also a nested list, such as:
[ [ "member1" , distance1, [longitude1, latitude1]] [ "member2" , distance2, [longitude2, latitude2]] ] 复制代码
Source code analysis
This section of the source code is longer. If you can't read it, you can read the Chinese comments directly, or skip to the summary section.
/* GEORADIUS key xy radius unit [WITHDIST] [WITHHASH] [WITHCOORD] [ASCDESC] * [COUNT count] [STORE key] [STORedisT key] * GEORADIUSBYMEMBER key member radius unit ... options ... */ void georadiusGeneric(client *c, int flags) { robj *key = c>argv[1]; robj *storekey = NULL; int stoRedist = 0; /* 0 for STORE, 1 for STORedisT. */ //根据key获取有序集合robj *zobj = NULL; if ((zobj = lookupKeyReadOrReply(c, key, shared.null[c>resp])) == NULL  checkType(c, zobj, OBJ_ZSET)) { return ; } //根据用户输入（经纬度/member）确认中心点经纬度int base_args; double xy[2] = { 0 }; if (flags & RADIUS_COORDS) { …… } //获取查询范围距离double radius_meters = 0, conversion = 1; if ((radius_meters = extractDistanceOrReply(c, c>argv + base_args  2, &conversion)) < 0) { return ; } //获取可选参数（withdist、withhash、withcoords、sort、count） int withdist = 0, withhash = 0, withcoords = 0; int sort = SORT_NONE; long long count = 0; if (c>argc > base_args) { ... ... } //获取STORE 和STORedisT 参数if (storekey && (withdist  withhash  withcoords)) { addReplyError(c, "STORE option in GEORADIUS is not compatible with " "WITHDIST, WITHHASH and WITHCOORDS options" ); return ; } //设定排序if (count != 0 && sort == SORT_NONE) sort = SORT_ASC; //利用中心点和半径计算目标区域范围GeoHashRadius georadius = geohashGetAreasByRadiusWGS84(xy[0], xy[1], radius_meters); //对中心点及其周围8个geohash网格区域进行查找，找出范围内元素对象geoArray *ga = geoArrayCreate(); membersOfAllNeighbors(zobj, georadius, xy[0], xy[1], radius_meters, ga); //未匹配返空/* If no matching results, the user gets an empty reply. */ if (ga>used == 0 && storekey == NULL) { addReplyNull(c); geoArrayFree(ga); return ; } //一些返回值的设定和返回…… geoArrayFree(ga); } 复制代码
There are two core steps in the above code. One is to calculate the range of the center point, and the other is to find the center point and its surrounding 8 geohash grid regions. Corresponding to the two functions of geohashGetAreasByRadiusWGS84
and membersOfAllNeighbors
. Let's take a look:
 Calculate the center point range:
// geohash_helper.c
GeoHashRadius geohashGetAreasByRadiusWGS84(double longitude, double latitude, double radius_meters) { return geohashGetAreasByRadius(longitude, latitude, radius_meters); } //返回能够覆盖目标区域范围的9个geohashBox GeoHashRadius geohashGetAreasByRadius(double longitude, double latitude, double radius_meters) { //一些参数设置GeoHashRange long_range, lat_range; GeoHashRadius radius; GeoHashBits hash ; GeoHashNeighbors neighbors; GeoHashArea area; double min_lon, max_lon, min_lat, max_lat; double bounds[4]; int steps; //计算目标区域外接矩形的经纬度范围（目标区域为：以目标经纬度为中心，半径为指定距离的圆） geohashBoundingBox(longitude, latitude, radius_meters, bounds); min_lon = bounds[0]; min_lat = bounds[1]; max_lon = bounds[2]; max_lat = bounds[3]; //根据目标区域中心点纬度和半径，计算带查询的9个搜索框的geohash精度（位） //这里用到latitude主要是针对极地的情况对精度进行了一些调整（纬度越高，位数越小） steps = geohashEstimateStepsByRadius(radius_meters,latitude); //设置经纬度最大最小值：180<=longitude<=180, 85<=latitude<=85 geohashGetCoordRange(&long_range,&lat_range); //将待查经纬度按指定精度（steps）编码成geohash值geohashEncode(&long_range,&lat_range,longitude,latitude,steps,& hash ); //将geohash值在8个方向上进行扩充，确定周围8个Box（neighbors） geohashNeighbors(& hash ,&neighbors); //根据hash值确定area经纬度范围geohashDecode(long_range,lat_range, hash ,&area); //一些特殊情况处理…… //构建并返回结果radius.hash = hash ; radius.neighbors = neighbors; radius.area = area; return radius; } 复制代码
 Find the center point and its surrounding 8 geohash grid areas:
// geo.c
//在9个hash Box中获取想要的元素int membersOfAllNeighbors(robj *zobj, GeoHashRadius n, double lon, double lat, double radius, geoArray *ga) { GeoHashBits neighbors[9]; unsigned int i, count = 0, last_processed = 0; int debugmsg = 0; //获取9个搜索hash Box neighbors[0] = n.hash; …… neighbors[8] = n.neighbors.south_west; //在每个hash Box中搜索目标点for (i = 0; i < sizeof(neighbors) / sizeof(*neighbors); i++) { if (HASHISZERO(neighbors[i])) { if (debugmsg) D( "neighbors[%d] is zero" ,i); continue ; } //剔除可能的重复hash Box (搜索半径>5000KM时可能出现) if (last_processed && neighbors[i].bits == neighbors[last_processed].bits && neighbors[i].step == neighbors[last_processed].step) { continue ; } //搜索hash Box中满足条件的对象count += membersOfGeoHashBox(zobj, neighbors[i], ga, lon, lat, radius); last_processed = i; } return count; } int membersOfGeoHashBox(robj *zobj, GeoHashBits hash , geoArray *ga, double lon, double lat, double radius) { //获取hash Box内的最大、最小geohash值（52位） GeoHashFix52Bits min, max; scoresOfGeoHashBox( hash ,&min,&max); //根据最大、最小geohash值筛选zobj集合中满足条件的点return geoGetPointsInRange(zobj, min, max, lon, lat, radius, ga); } int geoGetPointsInRange(robj *zobj, double min, double max, double lon, double lat, double radius, geoArray *ga) { //搜索Range的参数边界设置（即9个hash Box其中一个的边界范围） zrangespec range = { .min = min, .max = max, .minex = 0, .maxex = 1 }; size_t origincount = ga>used; sds member; //搜索集合zobj可能有ZIPLIST和SKIPLIST两种编码方式，这里以SKIPLIST为例，逻辑是一样的if (zobj>encoding == OBJ_ENCODING_ZIPLIST) { …… } else if (zobj>encoding == OBJ_ENCODING_SKIPLIST) { zset *zs = zobj>ptr; zskiplist *zsl = zs>zsl; zskiplistNode *ln; //获取在hash Box范围内的首个元素（跳表数据结构，效率可比拟于二叉查找树），没有则返0 if ((ln = zslFirstInRange(zsl, &range)) == NULL) { /* Nothing exists starting at our min. No results. */ return 0; } //从首个元素开始遍历集合while (ln) { sds ele = ln>ele; //遍历元素超出range范围则break /* Abort when the node is no longer in range. */ if (!zslValueLteMax(ln>score, &range)) break ; //元素校验（计算元素与中心点的距离） ele = sdsdup(ele); if (geoAppendIfWithinRadius(ga,lon,lat,radius,ln>score,ele) == C_ERR) sdsfree(ele); ln = ln>level[0].forward; } } return ga>used  origincount; } int geoAppendIfWithinRadius(geoArray *ga, double lon, double lat, double radius, double score, sds member) { double distance, xy[2]; //解码错误, 返回error if (!decodeGeohash(score,xy)) return C_ERR; /* Can 't decode. */ //最终距离校验(计算球面距离distance看是否小于radius) if (!geohashGetDistanceIfInRadiusWGS84(lon,lat, xy[0], xy[1], radius, &distance)) { return C_ERR; } //构建并返回满足条件的元素geoPoint *gp = geoArrayAppend(ga); gp>longitude = xy[0]; gp>latitude = xy[1]; gp>dist = distance; gp>member = member; gp>score = score; return C_OK; } 复制代码
Algorithm summary
Leaving aside many optional parameters, simply summarize how the GEORADIUS command uses geohash to obtain the target location object:
1.Parameter extraction and verification;
2. Use the center point and input radius to calculate the area to be searched. This range of parameters includes the highest geohash grid level (accuracy) that meets the conditions and the corresponding Jiugong grid position that can cover the target area; (will be explained in detail later)
3. Traverse the Jiugong grid, and select the position object according to the range box of each geohash grid. Further find objects that are less than the input radius from the center point and return.
The direct description is not easy to understand. Let us briefly demonstrate the algorithm through the following two pictures:
Let the center of the left image be the search center, the green circular area be the target area, all points be the position objects to be searched, and the red points be the position objects that meet the conditions.
In the actual search, the geohash grid level (the grid size level in the figure on the right) is first calculated according to the search radius, and the position of the nine grid (that is, the position of the red nine grid) is determined. Then the points in the nine grid (blue dot and The distance between the red point and the center point, and finally the points (red points) within the distance range are filtered out.
Analysis of Algorithms
Why use this algorithm strategy for query, or what are the advantages of this strategy, let us analyze and explain in a questionandanswer manner.

Why find the highest geohash grid level that meets the conditions? Why use Jiugongge?
This is actually a problem, and it is essentially a preliminary screening of all element objects. In the multilayer geohash grid, each lowlevel geohash grid is composed of 4 higherlevel grids (as shown in the figure).
In other words, the higher the geohash grid level, the smaller the geographic range covered. When we calculated the highestlevel Jiugong grid (grid) based on the input radius and the position of the center point, which can cover the target area, we have filtered out the extra elements of Jiugong. The reason why the ninegrid grid is used here instead of a single grid is mainly to avoid the boundary situation and narrow the query area as much as possible. Imagine centering on 0 latitudes and longitudes. Even if you check a range of 1 meter, you need to check the entire area of the earth if a single grid covers it. Extending a circle in eight directions can effectively avoid this problem.

How to select an element object through the range box of a geohash grid? How efficient?
First, the geohash values in each geohash grid are continuous and have a fixed range. So as long as you find the ordered objects in the range of position objects. The following is the jump table data structure for ordered collections:It has a query efficiency similar to a binary search tree, and the average operation time complexity is O (log (N)). And all the elements at the bottom are arranged in order in the form of a linked list. Therefore, when querying, as long as the first value in the set is in the target geohash grid, and subsequent comparisons can be performed without searching multiple times. The Jiugong grid cannot be checked together. The reason for traversing one by one is that the geohash values corresponding to the grids of the Jiugong grid are not continuous. The query efficiency will be high only if it is continuous, otherwise many distance calculations are required.
In summary, we have analyzed the detailed process of "GEOADD" and "Check (GEORADIUS)" in the Redis Geo module from the perspective of source code. It can also calculate the function of GEORADIUS to find nearby people in Redis. The time complexity is O (N + log (M)), where N is the number of position elements within a specified radius, and M is calculated by the Nine Palace grid Number of elements. Combined with Redis's own memorybased storage characteristics, it has very high operating efficiency in actual use.
Reference
Redis command reference
geohash
ZSET data structure skiplist in Redis
Original address: https://juejin.im/post/5da40462f265da5baf410a11