Rocking with Steem-Python

·@danielsaori·
<html>
<center>
<h1>Making Steem-Python more reliable</h1><br />
<div class="pull-right"><img src="https://i.imgur.com/C9JVX1K.png" alt="dorabot" /><sub>Python code in action...</sub></div></center>
 
<p>
The official RPC nodes have been under heavy load recently and because of that, it has been beneficial to swap nodes completely or at least implement some kind of failover mechanism.
</p>
<p>
I have now started to use a mix of the following nodes:<br>
<i>https://gtg.steem.house:8090<br>
https://steemd.minnowsupportproject.org<br>
https://rpc.steemliberator.com<br>
https://steemd.privex.io<br>
https://steemd.steemit.com<br></i>
</p>
<p>
Although this has improved response times and reliability, it has not been perfect. One problem in particular kept coming back. Read along to understand the issue, how I fixed it, and the change I proposed to the steem-python library.
</p>
 
<p>
<center><img src="https://i.imgur.com/Lq3ISjy.png" /></center>
</p>
<p>
Most of my coding so far has gone into getting my bot, @dorabot, going, and that has moved along pretty smoothly. The issue I ran into, though, was reliably streaming data from the blockchain so that the bot can pick up comments sent to it. For example, I have a function that picks a random winner from all users who upvoted a post or a comment. It is triggered by posting a reply that includes the two words "@dorabot" and "?winner". As you can see below, any other text can be included as well.</p>
 
<center>
<p>https://steemitimages.com/DQmWBgeefvq7rRCtemCod3Adp1ML4mFhshQsZFTEfjP5era/image.png</p>
</center> 

<p>
The steem-python library has a number of different functions for streaming comments or fetching blocks from the blockchain, each with its own features or additional processing. But in the background, many of them use the API call <code>get_block</code> to actually fetch the data.</p>
 
<p>
Although I had started to use several different RPC nodes and implemented failover in case of errors, I regularly ran into issues with the "get_block" API call. In this implementation I used the "get_blocks()" function part of Steemd.</p>
 
<p>
The issue was not with the API call itself, but rather a combination of how steem-python is handling API calls and how the "get_blocks()" function is treating the return data. Steem-python is using a library called "urllib3" to execute all API calls. This is handled in "http_client.py", part of "steembase".</p>
 
<p>
The code in "http_client.py" has its own error checking: it will fail over to a redundant node, hold off further requests, and so on, to give the server a chance to recover from temporary issues. But the issue I found was with responses where the server actually replied but returned an error code, that is, a non-200 code such as 403 or 504.</p>
 
<p>
The code below is a snippet of the exec() function, part of http_client.py. In case an HTTP error, like a 403 code, is received from the server, nothing is really done. The response will be empty and will be returned to the calling function. 
</p>

<p> 
<pre><code>
if response.status not in tuple([*response.REDIRECT_STATUSES, 200]):
    logger.info('non 200 response:%s', response.status)
 
return self._return(
    response=response,
    args=args,
    return_with_args=return_with_args)
</code></pre>
<center><sub>(<a href="https://github.com/steemit/steem-python/blob/master/steembase/http_client.py" rel="nofollow noopener">from http_client.py - end of exec() function</a>).</sub></center>
</p>
 
<p>
An empty response by itself is nothing bad, as long as it is dealt with properly, but the "get_blocks" function, part of Steemd, does nothing special when it receives an empty response. No error is thrown; instead, it reruns exactly the same request. So if one RPC node is broken and constantly returns a 403 error, we will loop forever...<br>
The code below shows this loop.
</p>

<p>
<pre><code>
while missing:
    for block in self._get_blocks(missing):
        blocks[block['block_num']] = block

        available = set(blocks.keys())
        missing = required - available
</code></pre>
<center><sub>(<a href="https://github.com/steemit/steem-python/blob/master/steem/steemd.py" rel="nofollow noopener">from steemd.py - part of the get_blocks() function</a>).</sub></center>
</p>
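<p>
To illustrate one way out of such a loop, here is a minimal sketch of the same "fetch until nothing is missing" idea with an attempt limit. This is not code from steem-python: <code>fetch_blocks</code>, <code>fetch_batch</code> and <code>max_attempts</code> are hypothetical names I made up for the example, and <code>fetch_batch</code> stands in for whatever actually talks to the RPC node.
</p>

```python
def fetch_blocks(block_nums, fetch_batch, max_attempts=5):
    """Fetch the requested blocks, giving up after max_attempts passes.

    fetch_batch is a hypothetical callable: it takes a list of block
    numbers and returns a list of block dicts, or None/[] on failure.
    """
    required = set(block_nums)
    blocks = {}
    missing = set(required)
    attempts = 0
    while missing:
        if attempts >= max_attempts:
            # Surface the problem instead of silently retrying forever.
            raise RuntimeError(
                "gave up, still missing blocks: %s" % sorted(missing))
        attempts += 1
        # Treat an empty (None) response as "no blocks this time".
        for block in fetch_batch(sorted(missing)) or []:
            blocks[block['block_num']] = block
        missing = required - set(blocks)
    return [blocks[num] for num in sorted(blocks)]
```

<p>
The only real difference from the loop above is the counter: an empty response still triggers a retry, but a permanently broken node now ends in an exception the caller can handle.
</p>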

<p>
With the simple few lines in the following snippet, you can test this behaviour on your own. Run this from the python command line. <br>
I supply two nodes below, the first one is a non-existing link to GitHub which will always return a 404 error. DING! Perfect for this test. :) I'm using "steem.hostname" to check the active node. As you can see, I have to cancel the loop as it otherwise would keep on running forever.
</p>

<p>
<pre><code>
>>> from steem import Steem
>>> my_nodes = ['https://github.com/logddin', 'https://steemd.minnowsupportproject.org']
>>> steem = Steem(my_nodes)
>>> blocks = steem.get_blocks_range(16269929,16269930)
... (###output truncated)
^CNon 200: github.com (###stopped the loop with Ctrl+C)
.... (###output truncated)
>>> steem.hostname
'github.com'
>>>
</code></pre>
</p>

<p>
Below is an example using the get_account() function. Here we don't risk getting stuck in an infinite loop; instead, the empty response causes the function to immediately throw the error seen below. If you have used the steem-python library, you have probably seen a number of these. This is not really an issue on its own, as some simple error checking with "try:/except:" statements will easily catch it.
</p>

<p>
<pre><code>
>>> steem.get_account('dorabot')
Traceback (most recent call last):
... (###output truncated)
TypeError: 'NoneType' object is not iterable
>>>
</code></pre>
</p>
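<p>
As a rough sketch of that kind of error checking, the helper below retries a call a few times when it raises a TypeError. The name <code>call_with_retry</code> and its <code>retries</code> and <code>delay</code> parameters are my own assumptions for this example, not part of steem-python.
</p>

```python
import time


def call_with_retry(func, *args, retries=3, delay=1.0):
    """Call func(*args), retrying when an empty response raises TypeError.

    Hypothetical helper: retries is the total number of attempts and
    delay is the pause in seconds between them.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(1, retries + 1):
        try:
            return func(*args)
        except TypeError:
            if attempt == retries:
                # Out of attempts: let the caller see the original error.
                raise
            time.sleep(delay)
```

<p>
With a Steem instance at hand, this would be used along the lines of <code>call_with_retry(steem.get_account, 'dorabot')</code>.
</p>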

<center>
<p>
https://steemitimages.com/DQmVmQtuL4Yih8odD6i6A2vHM4tAtw6DoX6AACF4NwaH5pv/fixit.png
</p>
</center>

<p>
The code below shows the modifications I have made to the exec() function in http_client.py. I copied the logic from the other error handling done earlier in the exec() function. The original code also had a bug, where the second if-statement below was written as an elif-statement and hence would never be executed. That bug could cause another infinite loop if all the RPC nodes malfunctioned.
</p>

<p>
<pre><code>
if response.status not in tuple([*response.REDIRECT_STATUSES, 200]):
	logger.info('non 200 response:%s', response.status)
	# try switching nodes before giving up ### Added
	if _ret_cnt > 2: ### Added
		time.sleep(5 * _ret_cnt)  ### Added
	if _ret_cnt > 10: ### Added
		return self._return(response=response.status) ### Added
	self.next_node() ### Added
	return self.exec(name, *args, return_with_args=return_with_args, _ret_cnt=_ret_cnt + 1) ### Added

return self._return(
	response=response,
	args=args,
	return_with_args=return_with_args)
</code></pre>
<center><sub>(<a href="https://github.com/steemit/steem-python/blob/master/steembase/http_client.py" rel="nofollow noopener">from http_client.py - with my modifications</a>).</sub></center>
</p>
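<p>
If you want to play with this retry logic without touching the library, here is a small standalone sketch of the same idea: switch nodes on a non-200 response, start sleeping after a couple of failures, and give up after a fixed number of retries. Everything in it is hypothetical (the <code>FailoverClient</code> class, the <code>send</code> callable and the <code>sleep</code> parameter); only the thresholds are borrowed from my change above. It is written as a loop rather than a recursive call, which is my own choice for the example.
</p>

```python
import itertools
import time

# 200 plus the redirect codes urllib3 lists in REDIRECT_STATUSES.
OK_STATUSES = {200, 301, 302, 303, 307, 308}


class FailoverClient:
    """Hypothetical stand-in for an RPC client with node failover."""

    def __init__(self, nodes, send, sleep=time.sleep):
        # send(node, name, args) -> (status, body) is an assumed transport.
        self._cycle = itertools.cycle(list(nodes))
        self.current = next(self._cycle)
        self._send = send
        self._sleep = sleep

    def next_node(self):
        """Move on to the next node, wrapping around at the end."""
        self.current = next(self._cycle)

    def call(self, name, *args, max_retries=10):
        ret_cnt = 0
        while True:
            status, body = self._send(self.current, name, args)
            if status in OK_STATUSES:
                return body
            if ret_cnt >= max_retries:
                raise ConnectionError(
                    "no node gave a usable response for %r" % name)
            if ret_cnt > 2:
                # Back off a little more on every further failure.
                self._sleep(5 * ret_cnt)
            self.next_node()
            ret_cnt += 1
```

<p>
Passing the transport and the sleep function in as arguments makes it easy to try the behaviour with fake nodes, without waiting for real timeouts.
</p>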

<p>
With the modified http_client.py in place, we can re-run the same test done above with the get_blocks_range() function.
</p>
<p>
The script starts by sending the request to github.com, but as a 404 error is received, it quickly fails over to steemd.minnowsupportproject.org.<br> 
(Thanks to @followbtcnews & @crimsonclad for hosting this RPC node as part of MSP!!!)
</p>

<p>
<pre><code>
>>> from steem import Steem
>>> my_nodes = ['https://github.com/logddin', 'https://steemd.minnowsupportproject.org']
>>> steem = Steem(my_nodes)
>>> steem.hostname
'github.com'
>>> blocks = steem.get_blocks_range(16269929,16269930)
Non 200: github.com ### Print statement added for troubleshooting
>>> steem.hostname
'steemd.minnowsupportproject.org'
>>>
</code></pre>
</p>

<p>
Since implementing this, I have had no more issues with streaming blocks from the blockchain. And as a bonus, since fewer empty responses are returned, my logs show that I deal with far fewer exceptions.</p>

<p>
I have submitted two pull requests on GitHub. Let's see what reviewers will say and if something is wrong with my logic.<br>
https://github.com/steemit/steem-python/pulls
</p>

<center>
<h2>Thank you for reading!<br />Stay tuned for future updates.</h2>
<h3>Please let me know if you have any questions.<br /> And please ping me (@danielsaori) if you connect to Discord.</h3>
</center>
 
<center><img src="https://steemitimages.com/0x0/https://steemitimages.com/0x0/https://steemitimages.com/0x0/https://steemitimages.com/DQmRSmRyg4MdRdiKsWTMbfyiAG673K1yP65MoUTbCXGp9Xi/palfoot.gif" /></center>
 
<center>
<p>Click <a href="https://discordapp.com/invite/E4t4efP" rel="nofollow noopener"><b>HERE</b></a> to connect to MSP's Discord server.</p>
<p><br /></p>
<p><img src="https://steemitimages.com/0x0/https://steemitimages.com/0x0/https://media.giphy.com/media/xUPGcmv20b7T9cHWzm/giphy.gif" /></p></center>
</html>