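Today I tested sending a fairly long ROS message from the host PC, and the hardware did not respond.
Walk along the river often enough and your shoes are bound to get wet.
The condition in the firmware code had once again been mistyped like this: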
if (a = 0) {  // bug: = assigns 0 to a, so the test is always false; a == 0 was intended
    //...
}
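= and == are hard to tell apart at a glance. Maddening, and yet another chunk of my limited lifespan wasted!!
I thought I had found the culprit, but to my surprise the problem persisted.
So keep testing; the root cause has to turn up eventually: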
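1.
Shortening the ROS message works;
lengthening it gets no response.
2.
Could it be interference, with errors creeping in over the serial link?
I switched to a short, shielded USB cable; the problem persisted.
3.
I varied the data length to look for a threshold,
and found that the threshold is not fixed: it changes with the data.
4.
With random data,
it sometimes works and sometimes fails.
...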
Out of options, I had to dig into the library source.
I suspected that the checksum test was failing.
The checksum algorithm is simple:
https://blog.csdn.net/qq_38288618/article/details/102931684
Message Length Checksum = 255 - ((Message Length High Byte +
                                  Message Length Low Byte) % 256)
Message Data Checksum = 255 - ((Topic ID Low Byte +
                                Topic ID High Byte +
                                Data byte values) % 256)
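That is, the length checksum sums the two length bytes, and the data checksum sums the two topic ID bytes plus all data bytes; each sum is taken modulo 256 and subtracted from 255. So where does it go wrong?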
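The two formulas above can be sketched on the sending side as follows. This is a minimal illustration, not code from rosserial: the function names length_checksum and data_checksum are hypothetical, and unsigned types are used so that the sums cannot go negative.

```cpp
#include <cstddef>
#include <cstdint>

// Length checksum: 255 - ((length low byte + length high byte) % 256).
std::uint8_t length_checksum(std::uint16_t length) {
    unsigned sum = (length & 0xFFu) + ((length >> 8) & 0xFFu);
    return static_cast<std::uint8_t>(255u - (sum % 256u));
}

// Data checksum: 255 - ((topic ID low + topic ID high + all data bytes) % 256).
std::uint8_t data_checksum(std::uint16_t topic_id,
                           const std::uint8_t* data, std::size_t size) {
    std::uint32_t sum = (topic_id & 0xFFu) + ((topic_id >> 8) & 0xFFu);
    for (std::size_t i = 0; i < size; ++i) {
        sum += data[i];  // unsigned accumulator: never negative
    }
    return static_cast<std::uint8_t>(255u - (sum % 256u));
}
```

Because each checksum is 255 minus the sum modulo 256, a receiver that adds the checksum byte to the same sum should see a total whose remainder modulo 256 is exactly 255, which is the condition spinOnce() tests.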
node_handle.h
The source code that receives packet data:
virtual int spinOnce()
{
  /* restart if timed out */
  uint32_t c_time = hardware_.time();
  if ((c_time - last_sync_receive_time) > (SYNC_SECONDS * 2200))
  {
    configured_ = false;
  }
  /* reset if message has timed out */
  if (mode_ != MODE_FIRST_FF)
  {
    if (c_time > last_msg_timeout_time)
    {
      mode_ = MODE_FIRST_FF;
    }
  }
  /* while available buffer, read data */
  while (true)
  {
    // If a timeout has been specified, check how long spinOnce has been running.
    if (spin_timeout_ > 0)
    {
      // If the maximum processing timeout has been exceeded, exit with error.
      // The next spinOnce can continue where it left off, or optionally
      // based on the application in use, the hardware buffer could be flushed
      // and start fresh.
      if ((hardware_.time() - c_time) > spin_timeout_)
      {
        // Exit the spin, processing timeout exceeded.
        return SPIN_TIMEOUT;
      }
    }
    int data = hardware_.read();
    if (data < 0)
      break;
    checksum_ += data;
    if (mode_ == MODE_MESSAGE) /* message data being received */
    {
      message_in[index_++] = data;
      bytes_--;
      if (bytes_ == 0) /* is message complete? if so, checksum */
        mode_ = MODE_MSG_CHECKSUM;
    }
    else if (mode_ == MODE_FIRST_FF)
    {
      if (data == 0xff)
      {
        mode_++;
        last_msg_timeout_time = c_time + SERIAL_MSG_TIMEOUT;
      }
      else if (hardware_.time() - c_time > (SYNC_SECONDS * 1000))
      {
        /* We have been stuck in spinOnce too long, return error */
        configured_ = false;
        return SPIN_TIMEOUT;
      }
    }
    else if (mode_ == MODE_PROTOCOL_VER)
    {
      if (data == PROTOCOL_VER)
      {
        mode_++;
      }
      else
      {
        mode_ = MODE_FIRST_FF;
        if (configured_ == false)
          requestSyncTime(); /* send a msg back showing our protocol version */
      }
    }
    else if (mode_ == MODE_SIZE_L) /* bottom half of message size */
    {
      bytes_ = data;
      index_ = 0;
      mode_++;
      checksum_ = data; /* first byte for calculating size checksum */
    }
    else if (mode_ == MODE_SIZE_H) /* top half of message size */
    {
      bytes_ += data << 8;
      mode_++;
      if (bytes_ >= sizeof(message_in)) { mode_ = MODE_FIRST_FF; } /*zzzzzzzzzzzzz-----------*/
    }
    else if (mode_ == MODE_SIZE_CHECKSUM)
    {
      if ((checksum_ % 256) == 255)
        mode_++;
      else
        mode_ = MODE_FIRST_FF; /* Abandon the frame if the msg len is wrong */
    }
    else if (mode_ == MODE_TOPIC_L) /* bottom half of topic id */
    {
      topic_ = data;
      mode_++;
      checksum_ = data; /* first byte included in checksum */
    }
    else if (mode_ == MODE_TOPIC_H) /* top half of topic id */
    {
      topic_ += data << 8;
      mode_ = MODE_MESSAGE;
      if (bytes_ == 0)
        mode_ = MODE_MSG_CHECKSUM;
    }
    else if (mode_ == MODE_MSG_CHECKSUM) /* do checksum */
    {
      mode_ = MODE_FIRST_FF;
      if ((checksum_ % 256) == 255)
      {
        if (topic_ == TopicInfo::ID_PUBLISHER)
        {
          requestSyncTime();
          negotiateTopics();
          last_sync_time = c_time;
          last_sync_receive_time = c_time;
          return SPIN_ERR;
        }
        else if (topic_ == TopicInfo::ID_TIME)
        {
          syncTime(message_in);
        }
        else if (topic_ == TopicInfo::ID_PARAMETER_REQUEST)
        {
          req_param_resp.deserialize(message_in);
          param_recieved = true;
        }
        else if (topic_ == TopicInfo::ID_TX_STOP)
        {
          configured_ = false;
        }
        else
        {
          if (subscribers[topic_ - 100])
            subscribers[topic_ - 100]->callback(message_in);
        }
      }
      else
      {
        //zzz test-------------------------------------------------------------- test hook: deliver the message without checking the checksum
        //delay(1000);
        //if (subscribers[topic_ - 100])
        //  subscribers[topic_ - 100]->callback(message_in);
      }
    }
  }
  /* occasionally sync time */
  if (configured_ && ((c_time - last_sync_time) > (SYNC_SECONDS * 500)))
  {
    requestSyncTime();
    last_sync_time = c_time;
  }
  return SPIN_OK;
}
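Reading through the code, I saw nothing wrong.
So I operated on it: I changed the code to skip the checksum test, and comparing the echoed data with what was sent showed no errors either.
So where on earth was the problem?
Looking closely, I noticed that the variable holding the running sum is declared as int. So even the experts slip up.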
line 197
int checksum_;
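This is where the problem lies.
int is a signed type, with both positive and negative values.
Used as the checksum accumulator, it turns negative once the running sum overflows; on a board where int is only 16 bits wide (as on 8-bit AVR boards), that happens as soon as the byte sum of one frame exceeds 32767. This also explains why the threshold depends on the data: what matters is the sum of the bytes, not their count.
The remainder of a negative number differs from that of a positive one: in C++ it is zero or negative, so checksum_ % 256 can never equal 255.
See the Baidu Baike explanation: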
https://baike.baidu.com/item/%E5%8F%96%E6%A8%A1%E8%BF%90%E7%AE%97/10739384?share_fr=pc_sina
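To make the effect concrete, here is a minimal sketch that simulates a 16-bit signed accumulator. The 16-bit width is an assumption (it matches int on 8-bit AVR boards), and as_int16 and passes_check are hypothetical helper names rather than rosserial functions; only the test (checksum % 256) == 255 is taken from spinOnce().

```cpp
#include <cstdint>

// Reinterprets the true (mathematical) byte sum as it would appear in a
// 16-bit two's-complement int that has wrapped around. The narrowing to a
// signed type is implementation-defined before C++20, but mainstream
// compilers wrap modulo 65536.
std::int16_t as_int16(std::uint32_t true_sum) {
    return static_cast<std::int16_t>(static_cast<std::uint16_t>(true_sum));
}

// The receiver-side condition from spinOnce(), applied to a 16-bit signed sum.
// In C++ the result of % takes the sign of the dividend, so a negative
// checksum yields a remainder in the range -255..0.
bool passes_check(std::int16_t checksum) {
    return (checksum % 256) == 255;
}
```

Take a true byte sum of 33023, for example 129 bytes of 0xFF followed by a checksum byte of 128. Mathematically 33023 % 256 is 255, so the frame is valid, but the wrapped 16-bit value is -32513, and -32513 % 256 is -1, so passes_check rejects it. A sum of 32767 or less is unaffected, which matches the observation that short messages work.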
I reported this bug to the rosserial library; see GitHub for details:
communication problem #474
https://github.com/ros-drivers/rosserial/pull/474
checksum_ type error
line 197
int checksum_; //There must be some Errors caused by modular operation of negative number.zzz.
…
}
else if (mode_ == MODE_MSG_CHECKSUM) /* do checksum */
{
  mode_ = MODE_FIRST_FF;
  if ((checksum_ % 256) == 255) /*--------------------------this place------------------------------*/
  {
…
checksum_ keeps accumulating and becomes negative. Communication failure when this condition is not met!
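One way to avoid the problem is to accumulate in an unsigned type. The sketch below is an assumption for illustration, not the change made in the pull request and not rosserial code; frame_checksum_ok is a hypothetical stand-alone function that receives the topic ID bytes, the payload and the trailing checksum byte in one buffer.

```cpp
#include <cstddef>
#include <cstdint>

// Sums every byte in an unsigned 8-bit accumulator. Unsigned arithmetic
// wraps modulo 256 by definition, so the value is never negative and no
// explicit % 256 is needed.
bool frame_checksum_ok(const std::uint8_t* bytes, std::size_t size) {
    std::uint8_t checksum = 0;
    for (std::size_t i = 0; i < size; ++i) {
        checksum = static_cast<std::uint8_t>(checksum + bytes[i]);
    }
    return checksum == 255;  // same condition as (sum % 256) == 255
}
```

With this accumulator, the result depends only on the bytes themselves and not on the width of int on the target board, so a frame of 129 bytes of 0xFF plus a checksum byte of 128 is accepted.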
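The inconspicuous details in code are exactly the ones that get overlooked, and even the experts are not immune.
Another day of my life gone, but I finally traced it to a bug in the official library.
To err is human.
Fine, I have to forgive both myself and my predecessors.
I hope the one day it cost me saves a day each for 1000 fellow travelers!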