Bug on Hivemind’s following data
utopian-io·@emrebeyler·
#### Project Information
* Repository: https://github.com/steemit/hivemind
* Project Name: Hivemind
* Publisher: Steemit inc.
* Related issue at Github: https://github.com/steemit/hivemind/issues/191
#### Problem
The Hivemind-backed `api.steemit.com` reports invalid or missing following data for some accounts when compared to a full node.
#### How to reproduce
1. Query the user `curbot`'s following list. (`condenser_api.get_following`)
```
curl -s --data '{"jsonrpc":"2.0", "method":"condenser_api.get_following", "params":["curbot",null,"blog",100], "id":1}' https://api.steemit.com
```
2. Run the same query against a full node (https://rpc.usesteem.com):
```
curl -s --data '{"jsonrpc":"2.0", "method":"condenser_api.get_following", "params":["curbot",null,"blog",100], "id":1}' https://rpc.usesteem.com
```
The response from `api.steemit.com` differs from the full node's response and is incomplete.
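The same comparison can be scripted. Below is a minimal sketch using only the Python standard library. The helper names (`fetch_following`, `following_names`, `compare`) are made up for this example, and the code assumes that every entry in the `result` array carries a `following` field.

```python
import json
from urllib.request import Request, urlopen


def fetch_following(node, account, limit=100):
    """POST the condenser_api.get_following request used above to `node`."""
    payload = json.dumps({
        "jsonrpc": "2.0",
        "method": "condenser_api.get_following",
        "params": [account, None, "blog", limit],
        "id": 1,
    }).encode("utf-8")
    request = Request(node, data=payload,
                      headers={"Content-Type": "application/json"})
    with urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))["result"]


def following_names(result):
    """Collect the followed account names (assumed field: `following`)."""
    return {entry["following"] for entry in result}


def compare(result_a, result_b):
    """Return (names only in result_a, names only in result_b)."""
    names_a = following_names(result_a)
    names_b = following_names(result_b)
    return names_a - names_b, names_b - names_a
```

Calling `compare(fetch_following("https://api.steemit.com", "curbot"), fetch_following("https://rpc.usesteem.com", "curbot"))` would then give the two sets of names. Only the first `limit` entries are compared, so differences near the end of the page may come from the page boundary rather than from missing data.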
#### A Python script to detect discrepancies
I don't believe this is an isolated case. I have seen similar discrepancies while testing and benchmarking [tower's new endpoints](https://steemit.com/utopian-io/@emrebeyler/new-version-on-tower-hivemind-rest).
This Python script detects discrepancies in follower lists.
```
from steem import Steem
from steem.account import Account


def get_diff(account):
    # Follower list as reported by the Hivemind-backed node.
    followers_on_hivemind = Account(
        account,
        steemd_instance=Steem(
            nodes=["https://api.steemit.com"])
    ).get_followers()

    # Follower list as reported by a full node.
    followers_on_full_node = Account(
        account,
        steemd_instance=Steem(
            nodes=["https://rpc.usesteem.com"])
    ).get_followers()

    print(
        "Accounts listed on api.steemit.com but not in the rpc.usesteem.com")
    print(set(followers_on_hivemind).difference(set(followers_on_full_node)))
    print("*" * 42)
    print(
        "Accounts listed on rpc.usesteem.com but not in the api.steemit.com")
    print(set(followers_on_full_node).difference(set(followers_on_hivemind)))


get_diff("emrebeyler")
```
***
The result for `@emrebeyler`'s followers:
```
Accounts listed on api.steemit.com but not in the rpc.usesteem.com
set()
******************************************
Accounts listed on rpc.usesteem.com but not in the api.steemit.com
{'hariyati.amin', 'curbot', 'kenzyobiadi', 'erhanbute'}
```
***
After some digging, I found a rare case involving a differently formatted custom JSON.
For example, I checked `curbot`'s account history for the moment it followed my account and found this transaction:
[Transaction ID: aaccccb73b6dfcb4bbf95f6d2dcb76e1c87137e9](https://steemd.com/b/25992870#aaccccb73b6dfcb4bbf95f6d2dcb76e1c87137e9)
It looks like `curbot` was bundling several follow operations into one transaction. steemd picked these up and registered them as valid follow actions.
However, Hive's indexer ignores the `custom_json` op if the length of the loaded JSON is greater than 2.
https://github.com/steemit/hivemind/blob/f7a467921678d928a0d94928c811442b8ab80bce/hive/indexer/custom_op.py#L55
In this case the length is greater than 2 because the payload looks like this:
```
[
    ['follow', {
        'follower': 'curbot',
        'following': 'kevinwong',
        'what': ['blog']
    }],
    ['follow', {
        'follower': 'curbot',
        'following': 'nothingismagick',
        'what': ['blog']
    }],
    ['follow', {
        'follower': 'curbot',
        'following': 'simnrodrguez',
        'what': ['blog']
    }],
    ['follow', {
        'follower': 'curbot',
        'following': 'steem-ua',
        'what': ['blog']
    }],
    ['follow', {
        'follower': 'curbot',
        'following': 'decentraland',
        'what': ['blog']
    }],
    ['follow', {
        'follower': 'curbot',
        'following': 'mikepm74',
        'what': ['blog']
    }],
    ['follow', {
        'follower': 'curbot',
        'following': 'empath',
        'what': ['blog']
    }],
    ['follow', {
        'follower': 'curbot',
        'following': 'emrebeyler',
        'what': ['blog']
    }],
    ['follow', {
        'follower': 'curbot',
        'following': 'eroche',
        'what': ['blog']
    }],
    ['follow', {
        'follower': 'curbot',
        'following': 'ervinneb',
        'what': ['blog']
    }]
]
```
***
This explains `curbot`.
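One way an indexer could tolerate this format is to normalise the decoded payload into a list of `[name, body]` pairs before processing. The sketch below is only an illustration of that idea, not Hivemind's code: `split_follow_ops` is a hypothetical helper, and dropping malformed items instead of rejecting the whole batch is an assumption.

```python
def split_follow_ops(op_json):
    """Return a list of [name, body] pairs from a decoded `follow` payload.

    Accepts a single op such as ['follow', {...}] or a batch such as
    [['follow', {...}], ['follow', {...}]]. Anything else yields [].
    """
    if not isinstance(op_json, list) or not op_json:
        return []
    # A single op starts with the op name (a string); a batch starts with a list.
    candidates = [op_json] if isinstance(op_json[0], str) else op_json
    ops = []
    for item in candidates:
        if (isinstance(item, list) and len(item) == 2
                and isinstance(item[0], str) and isinstance(item[1], dict)):
            ops.append(item)
    return ops


single = ["follow", {"follower": "curbot", "following": "emrebeyler", "what": ["blog"]}]
batch = [
    ["follow", {"follower": "curbot", "following": "kevinwong", "what": ["blog"]}],
    ["follow", {"follower": "curbot", "following": "emrebeyler", "what": ["blog"]}],
    ["follow", {"follower": "curbot", "following": "eroche", "what": ["blog"]}],
]

print(len(split_follow_ops(single)))  # 1
print(len(split_follow_ops(batch)))   # 3
```

Checking the type of the first element, rather than the list length, also covers a batch of exactly two follows, which has the same length as a single op.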
Regarding my other 3 missing followers:
| Follower | Following | Tx ID | Block num | Timestamp |
|---------------|------------|------------------------------------------|-----------|---------------------|
| erhanbute | emrebeyler | d10dcd1bdb661fc4e63f2464fa2262624db5d003 | 26710986 | 2018-10-11T09:55:21 |
| kenzyobiadi | emrebeyler | 9ef235eb36aac5e466b97ad3e459b7eb9495f898 | 26492393 | 2018-10-03T19:38:45 |
| hariyati.amin | emrebeyler | 383a36f7aa65724eb634ebdae141366674dc1df8 | 26450469 | 2018-10-02T08:41:33 |
***
The timestamps suggest that this happened between `2018-10-02` and `2018-10-11`. These transactions don't involve anything unusual.
Additionally, I checked `roadscape`'s followers on Steem and got these discrepancies:
```
{'curbot', 'kamvreto', 'msutyler'}
```
***
We already know the problem with `curbot`, so I checked the other accounts.
`kamvreto` followed `roadscape` at `2016-07-25T22:35:12`.
Here is the account history output:
```
{
    'trx_id': '2b7595b1f3e0e0105156d518b83d7eeaa19b6070',
    'block': 3514062,
    'trx_in_block': 3,
    'op_in_trx': 0,
    'virtual_op': 0,
    'timestamp': '2016-07-25T22:35:12',
    'op': ['custom_json', {
        'required_auths': [],
        'required_posting_auths': ['kamvreto'],
        'id': 'follow',
        'json': '{"follower":"kamvreto","following":"roadscape","what":["posts","blog"]}'
    }]
}
```
***
It was a **legacy** `custom_json` transaction. The tricky part is that the transaction's `what` property includes two elements.
You can see that the Follow constructor expects one element:
https://github.com/steemit/hivemind/blob/60dc61ee4bbde2080421a3fdf10c5b83be840e8b/hive/indexer/follow.py#L71
For this reason, Hive ignores this operation as well.
The problem is the same for the other missing follower of `roadscape`:
```
{
    'trx_id': 'c7694ff17ba7ba3fbe1740f05c2727ecbd98cd62',
    'block': 3409232,
    'trx_in_block': 1,
    'op_in_trx': 0,
    'virtual_op': 0,
    'timestamp': '2016-07-22T06:18:27',
    'op': ['custom_json', {
        'required_auths': [],
        'required_posting_auths': ['msutyler'],
        'id': 'follow',
        'json': '{"follower":"msutyler","following":"roadscape","what":["posts","blog"]}'
    }]
}
```
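To illustrate why a two-element `what` can be lost, here is a small sketch that contrasts a reader inspecting only the first entry with one that scans the whole list. Both functions and the `STATES` codes are hypothetical and exist only for this example; they are not taken from Hivemind or steemd.

```python
# Hypothetical state codes, used only in this sketch.
STATES = {"": 0, "blog": 1, "ignore": 2}


def first_element_state(what):
    """Stand-in for a reader that inspects only the first entry of `what`."""
    if not isinstance(what, list):
        return None
    first = what[0] if what else ""
    return STATES.get(first) if isinstance(first, str) else None


def follow_state(what):
    """Scan every entry and accept the list if exactly one state is recognised."""
    if not isinstance(what, list):
        return None
    if not what or what == [""]:
        return STATES[""]
    known = {w for w in what if isinstance(w, str) and w and w in STATES}
    if len(known) != 1:
        return None  # nothing recognised, or both blog and ignore
    return STATES[known.pop()]


legacy_what = ["posts", "blog"]
print(first_element_state(legacy_what))  # None: "posts" is not a known state
print(follow_state(legacy_what))         # 1: "blog" is found further in the list
```

Whether unrecognised entries such as `posts` should be skipped like this is a design decision for the indexer; the sketch only shows that the legacy ops above can be read as ordinary blog follows.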
***
Expanding the sample size:
Discrepancies on `@utopian-io`'s followers:
```
Accounts listed on rpc.usesteem.com but not in the api.steemit.com
{'qawazd', 'steemgems', 'curbot'}
```
***
| Follower | Following | Tx ID | Block num | Timestamp |
|-----------|------------|------------------------------------------|-----------|---------------------|
| steemgems | utopian-io | 25e9c3d8e625e634b68bd5e16e99327fd37174ae | 26722368 | 2018-10-11T19:25:27 |
| qawazd | utopian-io | 8de43899a8ad84b8bd65a896e71e3e0eafda0757 | 26838941 | 2018-10-15T20:37:51 |
***
These follow operations are valid. Their dates, `2018-10-11` and `2018-10-15`, are close to those of the follows missing from @emrebeyler's account.
#### TL;DR
- Some follow ops are missing on api.steemit.com's Hive instance, generally clustered around `2018-10`.
- Hive ignores a follow operation if it bundles multiple follows, although steemd accepts it (the case with @curbot).
- Hive ignores some legacy follow operations because they may include two elements in the `what` property (e.g. `["posts", "blog"]`).
#### My GitHub Account
https://github.com/emre