When we go to KXCI 91.3FM Tucson AZ – Real People Real Radio we get the error message An internal server error occurred.
Thanks for reporting it. We’re working on it as fast as we possibly can! Hopefully the site will be back very soon.
I apologize for the outage and thanks for your patience.
UPDATE (from @tom ) Mon April 25. I wanted to put a “technical difficulties” graphic near the top of this page but they were all too cute for me given how raw I still feel. So let’s have this photo.
Spinitron rents servers from the French provider OVH in their data center near Montreal, on which the database software that was crashing last week runs. Read about the OVH fire in Strasbourg last year in this Reuters article: Millions of websites offline after fire at French cloud services firm | Reuters
UPDATE (from @tom) Thu Apr 28. Sorry if that image I chose on Monday was alarming. That isn’t what happened to our servers. Spinitron is still safely at home in this building.
Which is right next door to a hydro-electric power station in Beauharnois, Quebec, Canada.
I’m afraid it’s all still rather menacing to look at, as large industrial installations often are. So if it helps, here’s a more relaxing image of the inside of a computer.
I assume the fans were not rotating at the time that snap got shot. Right?
The database servers have been crashing. Very mysterious.
Apr 20 16:08:32 bhs6 mysqld: realloc(): invalid next size
Apr 20 16:08:32 bhs6 mysqld: 220420 16:08:32 [ERROR] mysqld got signal 6 ;
I managed to recover the cluster and the service at least for now.
the cluster is down again. i’m going to try recovering from a recent backup
service is back up for now. i rebooted all the servers in the cluster just now. monitoring to see if this bug recurs. if it does we will need to install a different mariadb server version
no luck. all the db servers crashed again. i’ll attempt downgrading
i’ve restarted one of the nodes, bhs5, without the Galera cluster software, i.e. as a standalone server. Spinitron service is available again but i don’t know if it will last. even if it does, this is still just a temporary fix
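for anyone curious, taking one Galera node standalone is mostly a matter of switching replication off in its config before restarting. a hedged sketch — the file path and node names here are assumptions, not our actual config:

```ini
# /etc/mysql/mariadb.conf.d/60-galera.cnf  (path is an assumption)
[galera]
# Turn off Galera replication so the node starts as a plain MariaDB server.
wsrep_on = OFF
# Leave the cluster address commented out so it doesn't try to join peers.
# wsrep_cluster_address = gcomm://bhs5,bhs6,bhs7
```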
no joy. same db server crash. trying to load from a recent db backup now. will take an hour or so
backup loaded. service up for now. i don’t expect this to last.
can anyone help me downgrade Debian packages? i’m stuck with aptitude errors like this
mariadb-server-10.4 : Conflicts: mariadb-server (< 1:10.4.24+maria~buster) but 1:10.4.21+maria~buster is to be installed
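in case anyone hits the same wall: a conflict like the above usually means the metapackage and its versioned dependencies have to be downgraded together, all pinned to the same version. a hedged sketch (the exact package list is an assumption; yours may differ):

```shell
# Pin the metapackage and its dependencies to the same older release so
# apt's conflict check passes. Version taken from the error message above.
sudo apt install \
  mariadb-server=1:10.4.21+maria~buster \
  mariadb-server-10.4=1:10.4.21+maria~buster \
  mariadb-server-core-10.4=1:10.4.21+maria~buster \
  mariadb-client-10.4=1:10.4.21+maria~buster \
  mariadb-client-core-10.4=1:10.4.21+maria~buster
```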
i downgraded from mariadb-server 10.4.24 to 10.4.21 and i have some hope that this stabilizes things.
there’s more repair to do but i will try to deal with it tomorrow, after giving this latest change some time to see if it helps
damn. that didn’t help. i don’t know what to try next
i’ll keep nursing the currently functioning server along while i buy and set up new hardware
I just got off the phone with Tom. If anyone questions his commitment to y’all getting this problem fixed, know that he has been up all night nursing the servers and working hard to fix this problem. Tom and Eva have pulled out all the stops to get this strange and obscure problem corrected and make the servers as stable as they’ve been for many years.
got a new server. built it up with all new software versions. installed our apps. migrated databases and search engines to it. and now it’s running and seems to be working more or less. we’ll have to wait a few hours to confirm if the same crashes are going to happen.
many apologies, of course, for all the service interruptions.
damn. it’s just gone down again
i’m disabling the metadata push service for the time being to see if that helps
push is back on. turning it off seems not to have helped
thank you so much for the focused attention on these issues. I know you understand this critical connection to stations and their audience. KBCS looks ok right now for how we are implementing the playlist for app and web. I have access on Spinitron for creating playlists.
Please let us know if there’s anything we can do. Again, thank you for your attention to this server crash.
I noticed that all image files I uploaded in the past 18 hours have vanished from the playlists. This is not a complaint, just FYI, but you probably already know this. I will re-upload them. Just some images of Ukrainian album covers.
I’m impressed by your efforts to fix everything, Tom & Eva. You’re not alone: Brainwashed.com has been suffering from server issues for weeks now
sorry about the loss of some images. i was focused on the database and didn’t think to copy those over for a while.
thanks for the words of support. this is a very miserable experience. turning out to be eye-wateringly expensive too. we hired mariadb enterprise support this afternoon. $yikes.
The database server has behaved well for the last 11 hours since I made a configuration change recommended by support staff at MariaDB, the firm that develops the DB software that we use.
Now I will start to put the cluster back into normal operation. You may notice some features of Spinitron not working properly until it is all done (things like searches not finding the most recent spins or new releases). This will take many days.
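For the record, re-forming a stopped Galera cluster typically follows the usual bootstrap pattern, sketched below (commands hedged; the exact sequence depends on which node has the most recent state):

```shell
# On the node with the current data: bootstrap a new cluster from it.
sudo galera_new_cluster
# On each remaining node, one at a time: start mariadb so it performs a
# state snapshot transfer (SST) from the bootstrapped node.
sudo systemctl start mariadb
```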
At Apr 20 14:27:32 left coast time we got the first of these crashes. mariadb (the process) makes a fubar move in memory allocation and (I’m guessing now) glibc’s memory allocation software notices the mistake and aborts the process (i.e. libc raises signal 6, SIGABRT, in mariadb’s main process). mariadb doesn’t have a way to recover from this kind of error and attempts to shut itself down in the most urgent manner it knows, but hangs while trying to write a backtrace. This much looks like this:
realloc(): invalid next size
220421 19:08:13 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.5.15-MariaDB-0+deb11u1
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=11
max_threads=202
thread_count=11
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 575744 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f9ba8000c58
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fa478ba2d78 thread_stack 0x49000
Then I manually send a SIGKILL to mariadb and systemd restarts it. mariadb then attempts to recover the database from the files left behind. Initially this was complicated by the procedures for restarting a stuck Galera cluster, until I reconfigured to just one server. Then I experimented with different software versions and then new hardware, and eventually we got advice from MariaDB (the firm) to try dynamically linking mariadb to jemalloc(*). Then mariadb (the process) stopped crashing.
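In case it helps someone else: the usual way to get mariadb onto jemalloc without rebuilding it is to preload the library through a systemd drop-in. A hedged sketch — the library path varies by distro, and ours may differ:

```ini
# /etc/systemd/system/mariadb.service.d/jemalloc.conf  (path assumed)
[Service]
# Preload jemalloc so mariadbd uses it instead of glibc malloc.
Environment="LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2"
```

followed by `systemctl daemon-reload` and a service restart.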
So much for the good news. But for the glass-half-empty types like me (the type I imagine you’d want looking after the IT you rely on), the experience raises a number of questions. What changed some time before Apr 20 14:27:32 that led to this pathological behavior? (I guess some new pattern of requests.) Exactly how did mariadb crash? (The instruction sequence.) What, if anything, needs to be fixed in the mariadb+glibc configuration? How did mariadb get stuck in its emergency exit, requiring a manual kill -9 to restart?
(*) I wonder if the business model of offering a “Community” software version for $0 together with a superior but vaguely defined “Enterprise” version that comes with support services for $x might create a structural incentive to hoard magic tricks like this.