Python segmentation fault

Supporting 2.6 and up
sjarvela

Python segmentation fault

Post by sjarvela » Sun Mar 27, 2011 5:13 am

I have a PhidgetSBC with a temperature and humidity sensor attached, and I run a Python script on another computer to read the sensor data.

Previously I ran Ubuntu 8.04 and there were no problems. After a hard drive failure, I moved the script to Linux Mint 9 Isadora (on a Fit-PC2), where I get segmentation faults all the time.

To rule out other components, I wrote a test script that does nothing but log the sensor data. This script keeps crashing just the same, so the problem seems to be related to the Phidget library.

The Python version is 2.6.5, the SBC is running firmware 1.0.4.20110322 (minimal), and I've compiled the Phidget library, version 2.1.8 (20110322), from source.

The segmentation fault sometimes happens immediately when the script is started, and sometimes only after hours. The script has never run for more than 24 hours.

Usually the last log entry before the segfault is from the sensor detached event. I've also enabled Phidget internal logging, but nothing is logged around the time of the crash.

I also tried Python's trace module, but the last line is always the one waiting for user input (sys.stdin.readline()).

What else could I do to solve this?

Here's the test script:

import logging
import os
import sys
from logging.handlers import RotatingFileHandler

from Phidgets.PhidgetException import PhidgetErrorCodes, PhidgetException
from Phidgets.Events.Events import AttachEventArgs, DetachEventArgs, ErrorEventArgs, InputChangeEventArgs, OutputChangeEventArgs, SensorChangeEventArgs
from Phidgets.Devices.InterfaceKit import InterfaceKit

log = logging.getLogger("SensorTest")

class SensorTest():
   def __init__(self):
      log.debug("Opening interface")
      try:
         self.sensor = InterfaceKit()
      except RuntimeError as e:
         log.error(e)
         self.sensor = None
         return
   
   def start(self):
      try:
         log.debug("Phidget library %s" %(self.sensor.getLibraryVersion()))
         self.sensor.enableLogging(6, "phidget.log")
         self.sensor.setOnAttachHandler(self.onAttached)
         self.sensor.setOnDetachHandler(self.onDetached)
         self.sensor.setOnErrorhandler(self.onError)
         self.sensor.setOnInputChangeHandler(self.onInputChange)
         self.sensor.setOnOutputChangeHandler(self.onOutputChange)
         self.sensor.setOnSensorChangeHandler(self.onChange)
         self.sensor.setOnServerConnectHandler(self.onServerConnected)
         self.sensor.setOnServerDisconnectHandler(self.onServerDisconnected)
      except PhidgetException as e:
         log.error("Phidget Exception %i: %s" % (e.code, e.details))

      log.debug("Opening remote service")
      try:
         self.sensor.openRemote("sbc", 46749)
      except PhidgetException as e:
         log.error("Phidget Exception %i: %s" % (e.code, e.details))
         self.sensor = None
         return

   def onServerConnected(self, e):
      log.debug("Server connected")

   def onServerDisconnected(self, e):
      log.debug("Server disconnected")

   def onAttached(self, e):
      log.debug("Sensor attached: %s (%s) %s" %(self.sensor.getDeviceType(), self.sensor.getDeviceVersion(), self.sensor.getSerialNum()))
      log.debug("Inputs: %d" %(self.sensor.getSensorCount()))

      try:
         self.sensor.setRatiometric(True)

         log.debug("Setting triggers")
         self.sensor.setSensorChangeTrigger(0, 2)
         self.sensor.setSensorChangeTrigger(1, 10)
      except PhidgetException as e:
         log.error("Phidget Exception %i: %s" % (e.code, e.details))
      
   def onDetached(self, e):
      log.debug("Sensor detached")
      
   def onError(self, e):
      log.debug("Phidget Error %i: %s" % (e.eCode, e.description))
      
   def onChange(self, e):
      if (e.index == 0):
         self.onTemperature(e.value)
      elif (e.index == 1):
         self.onHumidity(e.value)

   def onInputChange(self, e):
      pass
   
   def onOutputChange(self, e):
      pass
   
   def onTemperature(self, v):
      t = (float(v) * 0.22222) - 61.11
      log.debug("Temperature %d = %.2f" %(v, t))
         
   def onHumidity(self, v):
      h = (float(v) * 0.1906) - 40.2
      log.debug("Humidity %d = %.2f" %(v, h))
   
   def stop(self):
      if self.sensor: self.sensor.closePhidget()

def initialize_logging():
   logging.basicConfig(level=logging.DEBUG, format="%(asctime)s %(name)-18s %(levelname)-8s" "%(message)s", datefmt="%Y-%m-%d %H:%M:%S")
   rotate = RotatingFileHandler(os.path.join(sys.path[0], 'sensor_test.log'), maxBytes = (1 * 1024 * 1024), backupCount=10)
   logging.getLogger('').addHandler(rotate)
   
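if __name__ == '__main__':
   initialize_logging()
   log = logging.getLogger('')
   t = SensorTest()
   try:
      t.start()
      # Block until Enter is pressed; event handlers fire in the meantime
      sys.stdin.readline()
   except KeyboardInterrupt:
      pass
   except Exception as e:
      log.exception(e)
   finally:
      # Always close the Phidget handle, including on a normal exit
      t.stop()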

AdamS

Re: Python segmentation fault

Post by AdamS » Mon Mar 28, 2011 4:13 pm

Out of curiosity, have you tried a different version of Python?

I can't see that being a solution, but it would be an interesting test.

I will run some tests here, but it is hard to replicate the situation, especially as you said it worked fine on your previous machine.

What versions of the libraries and SBC firmware were you running in the previous setup? If you roll back to those versions, does it work without the segfaults?

sjarvela

Re: Python segmentation fault

Post by sjarvela » Tue Mar 29, 2011 12:50 pm

Sure, I could try another Python version, as well as going back to the old firmware (which I think was 2.1.7.20101103 on the previous machine).

I know this can be hard to solve on your side. I guess I was looking for ideas about what could be wrong, or instructions on how to get more info on the segfault.

I used to get a Python trace with "python -m trace script.py", which did not give me anything useful, but then I saw another way of getting a trace in the Python thread on this forum: "strace -o out python script.py". I don't know what the difference is, but with this one the script has been running longer than ever. Once it crashes, I'll test those other versions (unless the trace reveals something).
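For reference, "python -m trace" records the Python lines that were executed, while strace records the system calls made by the whole process. Neither shows the native call stack at the moment of a crash inside a C library. One common way to get that is a core dump that can later be opened in gdb. The following is a minimal sketch, assuming a Linux system and that the hard core-size limit is not zero; the helper name `allow_core_dumps` is made up for illustration. It would be called at the top of the test script, before the Phidget is opened.

```python
import resource


def allow_core_dumps():
    """Raise the soft core-file size limit to the hard limit for this process.

    Returns the (soft, hard) limits that are in effect afterwards.
    If the hard limit is 0, core dumps stay disabled and the limit has to be
    raised outside the script (for example in the shell that starts it).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


limits = allow_core_dumps()
print("core limits (soft, hard):", limits)
```

Where the core file ends up, and whether one is written at all, depends on the system configuration, so this is only a starting point.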

sjarvela

Re: Python segmentation fault

Post by sjarvela » Wed Mar 30, 2011 12:15 am

I got a trace from strace, but I don't know if it proves anything.

Here's what it captured:

poll([{fd=7, events=POLLIN}], 1, 25000) = 1 ([{fd=7, revents=POLLIN}])
read(7, "l\2\1\1\36\0\0\0\3\3\0\0/\0\0\0\6\1s\0\6\0\0\0:1.319\0\0"..., 2048) = 94
write(6, "W", 1)                        = 1
read(7, "l\4\1\1<\0\0\0\4\3\0\0\227\0\0\0\1\1o\0\31\0\0\0/Client2"..., 2048) = 228
read(7, "l\4\1\1\0\0\0\0\5\3\0\0\217\0\0\0\1\1o\0\31\0\0\0/Client2"..., 2048) = 160
read(7, "l\4\1\1\0\0\0\0\6\3\0\0\217\0\0\0\1\1o\0\31\0\0\0/Client2"..., 2048) = 160
read(7, 0x8ee87e8, 2048)                = -1 EAGAIN (Resource temporarily unavailable)
gettimeofday({1301426991, 24611}, NULL) = 0
write(6, "W", 1)                        = 1
writev(7, [{"l\1\0\1(\0\0\0\n\0\0\0\203\0\0\0\1\1o\0\1\0\0\0/\0\0\0\0\0\0\0"..., 152}, {"\377\377\377\377\377\377\377\377\r\0\0\0_phidget._tcp\0\0\0\0\0\$
gettimeofday({1301426991, 26989}, NULL) = 0
poll([{fd=7, events=POLLIN}], 1, 25000) = 1 ([{fd=7, revents=POLLIN}])
read(7, "l\2\1\1\36\0\0\0\7\3\0\0/\0\0\0\6\1s\0\6\0\0\0:1.319\0\0"..., 2048) = 94
write(6, "W", 1)                        = 1
read(7, "l\4\1\1T\0\0\0\10\3\0\0\227\0\0\0\1\1o\0\31\0\0\0/Client2"..., 2048) = 572
read(7, 0x8ee87e8, 2048)                = -1 EAGAIN (Resource temporarily unavailable)
gettimeofday({1301426991, 33519}, NULL) = 0
write(6, "W", 1)                        = 1
writev(7, [{"l\1\0\1,\0\0\0\v\0\0\0\203\0\0\0\1\1o\0\1\0\0\0/\0\0\0\0\0\0\0"..., 152}, {"\377\377\377\377\377\377\377\377\21\0\0\0_phidget_sbc._tcp\0\0\0$
gettimeofday({1301426991, 34311}, NULL) = 0
poll([{fd=7, events=POLLIN}], 1, 25000) = 1 ([{fd=7, revents=POLLIN}])
read(7, "l\2\1\1\36\0\0\0\v\3\0\0/\0\0\0\6\1s\0\6\0\0\0:1.319\0\0"..., 2048) = 94
write(6, "W", 1)                        = 1
read(7, 0x8ee87e8, 2048)                = -1 EAGAIN (Resource temporarily unavailable)
mmap2(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0xb6f4d000
mprotect(0xb6f4d000, 4096, PROT_NONE)   = 0
clone(child_stack=0xb774d494, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_C$
time(NULL)                              = 1301426991
write(4, "Tue Mar 29 22:29:51 2011,-121571"..., 123) = 123
mmap2(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0xb674c000
mprotect(0xb674c000, 4096, PROT_NONE)   = 0
clone(child_stack=0xb6f4c494, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_C$
fstat64(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 5), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb674b000
read(0,  <unfinished ...>
+++ killed by SIGSEGV +++


Anyway, I thought I'd try compiling the Phidget library without zeroconf and connecting to the server with "openRemoteIP". So far it's been running fine, even with repeated executions, which previously caused segfaults quite easily.
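In terms of the test script, the workaround amounts to replacing the name-based openRemote call with an address-based one. A minimal sketch, assuming openRemoteIP takes the host address and the webservice port as its first two arguments; the address 192.168.1.50 and port 5001 are placeholders, not values from this thread, and FakeKit is a hypothetical stand-in so the snippet runs without the Phidget library installed.

```python
def open_by_address(sensor, host, port):
    """Open a remote Phidget by explicit address instead of a zeroconf server name.

    `sensor` is expected to be an InterfaceKit-like object. The assumption is
    that it exposes openRemoteIP(host, port), the call named in the post.
    """
    sensor.openRemoteIP(host, port)
    return sensor


class FakeKit(object):
    """Hypothetical stand-in that only records the call, for trying this offline."""

    def __init__(self):
        self.calls = []

    def openRemoteIP(self, host, port):
        self.calls.append((host, port))


# Placeholder address and port; substitute the SBC's static IP and its port.
kit = open_by_address(FakeKit(), "192.168.1.50", 5001)
print(kit.calls)
```

With the real library, an InterfaceKit instance would be passed in place of FakeKit.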

AdamS

Re: Python segmentation fault

Post by AdamS » Wed Mar 30, 2011 4:20 pm

Thanks for the info, keep me updated.

I will point our lead library developer to this and see what he can gather from this.

Really hard to say what it could be, especially with it being pretty random from the sounds of it.

sjarvela

Re: Python segmentation fault

Post by sjarvela » Thu Apr 07, 2011 3:20 am

Actually, I don't think the crashes were random at all. Like I said, they happened after a sensor detach/server disconnect event (at least those that I had logs for).

And now the script has been running for over a week without any problems, so I'm pretty confident it was indeed zeroconf related. I don't mind using the IP address, as it's static for the SBC, so I can live with this.

If you want, I can try to dig out some more info. Is there some way I could get a more user-friendly trace? Let me know if I can help.
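One option for a more readable trace is Python's faulthandler module, which writes the Python-level traceback of every thread when the process receives a fatal signal such as SIGSEGV. A minimal sketch, with some assumptions: faulthandler is part of the standard library only from Python 3.3 on, so on the 2.6.5 interpreter used here it would have to come from a separately installed backport, which has not been checked against this setup. The helper name and the log file name are made up for illustration.

```python
import faulthandler
import os
import tempfile


def enable_crash_traceback(path):
    """Write Python tracebacks of all threads to `path` on a fatal signal.

    The returned file object must stay open for the life of the process,
    because faulthandler writes to its file descriptor at crash time.
    """
    crash_log = open(path, "w")
    faulthandler.enable(file=crash_log, all_threads=True)
    return crash_log


# Hypothetical log location; any writable path works.
crash_log = enable_crash_traceback(
    os.path.join(tempfile.gettempdir(), "sensor_crash.log"))
print("faulthandler enabled:", faulthandler.is_enabled())
```

This only shows which Python code was running in each thread. It does not show the native frames inside the Phidget library, so it complements rather than replaces a debugger backtrace.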

AdamS

Re: Python segmentation fault

Post by AdamS » Thu Apr 07, 2011 10:08 am

That is interesting. I know that we had found a few cases where the webservice using zeroconf/Bonjour caused memory issues, and work to improve that has been done and is still ongoing.

I will have to experiment to see if I can reproduce this on my end and get some deeper info.

