digitalmars.D - Forums intermittently going down?
- Michael (12/12) Sep 17 2018 It has been occurring for the past two weeks now, at least. When
- Vladimir Panteleev (4/15) Sep 17 2018 Not just you. The server is just overloaded.
- Paolo Invernizzi (4/21) Sep 17 2018 I can confirm that in the last two weeks the overload is very
- Michael (4/21) Sep 17 2018 Okay great, as long as it's just not me and you guys are aware.
- Petar Kirov [ZombineDev] (4/7) Sep 17 2018 How feasible would be to have a simple page like
- Vladimir Panteleev (6/14) Sep 17 2018 Just ask me on IRC :)
- Vladimir Panteleev (3/5) Sep 20 2018 Performance should now be back to normal.
- PassingBy (15/20) Sep 25 2018 forum.dlang.org is temporarily down for maintenance, and should
- Vladimir Panteleev (4/5) Sep 25 2018 Looks like my previous hunch as to why it was slow was off.
- CharlesM (3/4) Sep 25 2018 Yeah it happened again today. I heard this site was made in D,
- Vladimir Panteleev (2/4) Sep 25 2018 No, just old server hardware and database fragmentation.
- H. S. Teoh (5/10) Sep 25 2018 Wow, that's GC-phobia like I've never seen before!
- Steven Schveighoffer (12/20) Sep 25 2018 Well, I thought it might be GC related also. It behaves similarly to how...
- Vladimir Panteleev (12/16) Sep 25 2018 I'm no DBA. Here's the schema:
- bachmeier (5/13) Sep 25 2018 How much data can there possibly be for a mailing list? I
- Vladimir Panteleev (12/15) Sep 25 2018 Currently, 3.8 GB.
- H. S. Teoh (24/30) Sep 25 2018 [...]
- CharlesM (2/4) Sep 25 2018 https://github.com/CyberShadow/DFeed/blob/master/schema.sql
- Vladimir Panteleev (24/34) Sep 25 2018 Yep, well, it's not very good at it (as it wasn't designed for
- H. S. Teoh (28/49) Sep 26 2018 [...]
- JN (5/9) Oct 04 2018 Seems like the issues with the forum got worse. It's hardly
- Vladimir Panteleev (6/9) Oct 04 2018 Yeah, painfully aware. I've been trying a bunch of different
- H. S. Teoh (7/17) Oct 05 2018 Maybe it's time for the Foundation to fund dedicated hardware for
- AlCaponeJr (4/21) Oct 05 2018 Why not use some Cloud Services instead of worrying about
- H. S. Teoh (12/25) Oct 05 2018 [...]
- Jacob Carlborg (8/10) Oct 06 2018 And you think that cannot happen when you're managing the hardware
- Jacob Carlborg (15/17) Oct 06 2018 How is having single server (I assume) behind a single ISP any less
- Basile B (3/5) Oct 06 2018 I've checked the logs from my stuff at Semaphore and it's since
- CharlesM (17/18) Sep 25 2018 https://github.com/CyberShadow/DFeed/blob/master/schema.sql
- Vladimir Panteleev (7/17) Sep 25 2018 Yep, this is mostly descriptive. Types in column declarations
- CharlesM (10/18) Sep 25 2018 I don't remember where I read, but it's because the type
- Abdulhaq (6/7) Sep 25 2018 SQLite was designed initially to be single local process, one
- Vladimir Panteleev (5/8) Sep 25 2018 I think that would be plausible if parts of the managed heap were
It has been occurring for the past two weeks now, at least. When I try to load the forum (on different networks) it will often hang for a while, and when it does eventually load a page, it is likely that clicking a link will cause it to get stuck loading again, or eventually display the following message: forum.dlang.org is temporarily down for maintenance, and should be back up shortly. Apologies for the inconvenience. Is anyone else experiencing this? I thought it might just be me but it seems to be happening across browsers and on different networks. Thanks, Michael.
Sep 17 2018
On Monday, 17 September 2018 at 11:02:39 UTC, Michael wrote:
> It has been occurring for the past two weeks now, at least. When I try to load the forum (on different networks) it will often hang for a while, and when it does eventually load a page, it is likely that clicking a link will cause it to get stuck loading again, or eventually display the following message: forum.dlang.org is temporarily down for maintenance, and should be back up shortly. Apologies for the inconvenience. Is anyone else experiencing this? I thought it might just be me but it seems to be happening across browsers and on different networks.

Not just you. The server is just overloaded. The high load is temporary, but will take a week or two to resolve.
Sep 17 2018
On Monday, 17 September 2018 at 11:51:04 UTC, Vladimir Panteleev wrote:
> On Monday, 17 September 2018 at 11:02:39 UTC, Michael wrote:
>> It has been occurring for the past two weeks now, at least. When I try to load the forum (on different networks) it will often hang for a while, and when it does eventually load a page, it is likely that clicking a link will cause it to get stuck loading again, or eventually display the following message: forum.dlang.org is temporarily down for maintenance, and should be back up shortly. Apologies for the inconvenience. Is anyone else experiencing this? I thought it might just be me but it seems to be happening across browsers and on different networks.
>
> Not just you. The server is just overloaded. The high load is temporary, but will take a week or two to resolve.

I can confirm that over the last two weeks the overload has been very frequent... almost one click in five.
Sep 17 2018
On Monday, 17 September 2018 at 11:51:04 UTC, Vladimir Panteleev wrote:
> On Monday, 17 September 2018 at 11:02:39 UTC, Michael wrote:
>> It has been occurring for the past two weeks now, at least. When I try to load the forum (on different networks) it will often hang for a while, and when it does eventually load a page, it is likely that clicking a link will cause it to get stuck loading again, or eventually display the following message: forum.dlang.org is temporarily down for maintenance, and should be back up shortly. Apologies for the inconvenience. Is anyone else experiencing this? I thought it might just be me but it seems to be happening across browsers and on different networks.
>
> Not just you. The server is just overloaded. The high load is temporary, but will take a week or two to resolve.

Okay great, as long as it's not just me and you guys are aware. Thanks for letting me (us) know.
Sep 17 2018
On Monday, 17 September 2018 at 11:51:04 UTC, Vladimir Panteleev wrote:
> [..] The high load is temporary, but will take a week or two to resolve.

How feasible would it be to have a simple page like https://status.github.com/ for sharing such information?
Sep 17 2018
On Monday, 17 September 2018 at 16:51:42 UTC, Petar Kirov [ZombineDev] wrote:
> On Monday, 17 September 2018 at 11:51:04 UTC, Vladimir Panteleev wrote:
>> [..] The high load is temporary, but will take a week or two to resolve.
>
> How feasible would it be to have a simple page like https://status.github.com/ for sharing such information?

Just ask me on IRC :)

Maintaining such a status page would take sufficient time/effort that it would be better spent on the problem itself. (But if someone volunteers, why not...)
Sep 17 2018
On Monday, 17 September 2018 at 11:51:04 UTC, Vladimir Panteleev wrote:
> The high load is temporary, but will take a week or two to resolve.

Performance should now be back to normal.
Sep 20 2018
On Friday, 21 September 2018 at 00:57:42 UTC, Vladimir Panteleev wrote:
> On Monday, 17 September 2018 at 11:51:04 UTC, Vladimir Panteleev wrote:
>> The high load is temporary, but will take a week or two to resolve.
>
> Performance should now be back to normal.

forum.dlang.org is temporarily down for maintenance, and should be back up shortly. Apologies for the inconvenience.
forum.dlang.org is temporarily down for maintenance, and should be back up shortly. Apologies for the inconvenience.
forum.dlang.org is temporarily down for maintenance, and should be back up shortly. Apologies for the inconvenience.
forum.dlang.org is temporarily down for maintenance, and should be back up shortly. Apologies for the inconvenience.
forum.dlang.org is temporarily down for maintenance, and should be back up shortly. Apologies for the inconvenience.
forum.dlang.org is temporarily down for maintenance, and should be back up shortly. Apologies for the inconvenience.

Geeeehhhh! For a TEXT ONLY newsgroup proxy... come on guys!
Sep 25 2018
On Friday, 21 September 2018 at 00:57:42 UTC, Vladimir Panteleev wrote:
> Performance should now be back to normal.

Looks like my previous hunch as to why it was slow was off. Should be fixed now.
Sep 25 2018
On Monday, 17 September 2018 at 11:02:39 UTC, Michael wrote:
> ...

Yeah, it happened again today. I heard this site was made in D; maybe it's because of the GC?
Sep 25 2018
On Tuesday, 25 September 2018 at 18:26:58 UTC, CharlesM wrote:
> Yeah, it happened again today. I heard this site was made in D; maybe it's because of the GC?

No, just old server hardware and database fragmentation.
Sep 25 2018
On Tue, Sep 25, 2018 at 08:41:51PM +0000, Vladimir Panteleev via Digitalmars-d wrote:
> On Tuesday, 25 September 2018 at 18:26:58 UTC, CharlesM wrote:
>> Yeah, it happened again today. I heard this site was made in D; maybe it's because of the GC?
>
> No, just old server hardware and database fragmentation.

Wow, that's GC-phobia like I've never seen before!

T

--
Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth? -- Michael Beibl
Sep 25 2018
On 9/25/18 5:05 PM, H. S. Teoh wrote:
> On Tue, Sep 25, 2018 at 08:41:51PM +0000, Vladimir Panteleev via Digitalmars-d wrote:
>> On Tuesday, 25 September 2018 at 18:26:58 UTC, CharlesM wrote:
>>> Yeah, it happened again today. I heard this site was made in D; maybe it's because of the GC?
>>
>> No, just old server hardware and database fragmentation.
>
> Wow, that's GC-phobia like I've never seen before!

Well, I thought it might be GC related also. It behaves similarly to how you would expect a GC pause to behave (several fast responses, then one that takes 5 seconds to come back). But lately, I've noticed I just get the "down for maintenance" message more than a delayed response.

In any case, I generally don't use the forum except in read-only mode on my phone. For posting, I'm generally using NNTP.

I'll note that when I started running into DB slowdowns on a system (not related to D), adding one index fixed the issue. Sometimes linear searches are fast enough to hide in plain sight :)

-Steve
Sep 25 2018
On Tuesday, 25 September 2018 at 21:12:54 UTC, Steven Schveighoffer wrote:
> I'll note that when I started running into DB slowdowns on a system (not related to D), adding one index fixed the issue. Sometimes linear searches are fast enough to hide in plain sight :)

I'm no DBA. Here's the schema:

https://github.com/CyberShadow/DFeed/blob/master/schema.sql

Sometimes the database (SQLite) behaves unusually slowly until you tell it to analyze itself; then it figures out some internal index it has to use that it wasn't using before (with no changes to the schema). Those analysis runs take a long time, though, and the database is offline while they run.

More generally, though, a big factor is probably that the size of the data set is exceeding the intended use cases for the database software used.
Sep 25 2018
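As a rough illustration of the "tell it to analyze itself" step described above, here is a minimal sketch using Python's standard `sqlite3` module. The `posts` table, its columns, and the index name are hypothetical and are not taken from DFeed's schema; the sketch only shows that `ANALYZE` writes planner statistics into SQLite's `sqlite_stat1` table.

```python
import sqlite3

# Hypothetical miniature of a posts table; names are illustrative only,
# not DFeed's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id TEXT, grp TEXT, time INTEGER)")
conn.execute("CREATE INDEX idx_posts_grp ON posts (grp)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [(f"msg{i}", f"group{i % 10}", i) for i in range(1000)],
)
conn.commit()

# ANALYZE gathers statistics about tables and indexes and stores them in
# sqlite_stat1, which the query planner consults when choosing a plan.
conn.execute("ANALYZE")
stats = conn.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()
for tbl, idx, stat in stats:
    print(tbl, idx, stat)
```

On a multi-gigabyte database a full `ANALYZE` can be slow, which matches the downtime described in the post. SQLite also documents `PRAGMA optimize`, which runs the analysis selectively; whether that would help in this particular setup is not something the thread establishes.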
On Tuesday, 25 September 2018 at 21:20:29 UTC, Vladimir Panteleev wrote:
> Sometimes the database (SQLite) behaves unusually slowly until you tell it to analyze itself; then it figures out some internal index it has to use that it wasn't using before (with no changes to the schema). Those analysis runs take a long time, though, and the database is offline while they run.
>
> More generally, though, a big factor is probably that the size of the data set is exceeding the intended use cases for the database software used.

How much data can there possibly be for a mailing list? I regularly see stories about companies using SQLite for databases in the hundreds of GB.
Sep 25 2018
On Tuesday, 25 September 2018 at 21:42:40 UTC, bachmeier wrote:
> How much data can there possibly be for a mailing list?

Currently, 3.8 GB. A good part of that is the full-text index required for searching. (It does work really well, though - no need for Lucene or such.)

> I regularly see stories about companies using SQLite for databases in the hundreds of GB.

One thing possible with a traditional RDBMS that's not possible with SQLite is processing several simultaneous requests. The synchronous API translates to the synchronous nature of the entire program: when the forum hits a request it needs a few seconds to handle, it can't process any requests during that time, even those it could answer without consulting the database (as much is cached in RAM).
Sep 25 2018
On Wed, Sep 26, 2018 at 01:07:29AM +0000, Vladimir Panteleev via Digitalmars-d wrote:
[...]
> One thing possible with a traditional RDBMS that's not possible with SQLite is processing several simultaneous requests. The synchronous API translates to the synchronous nature of the entire program: when the forum hits a request it needs a few seconds to handle, it can't process any requests during that time, even those it could answer without consulting the database (as much is cached in RAM).
[...]

What version of SQLite are you using? AFAIK, SQLite itself does support concurrent access. Though it does have to be explicitly compiled with that option, otherwise it will only issue a runtime error. Of course, locking is not as fine-grained, so if one request locks one table then it will block everything else.

IME, though, SQLite performance can be greatly improved simply by indexing columns used for lookup. Except for row ID, SQLite doesn't index by default, so if you're filtering your selects by other columns, you're potentially hitting O(n) table scans per lookup.

I don't know what your schema looks like, so it's hard to give specifics, but basically, any column used in a WHERE clause is a candidate for indexing. Of course, it's a judgment call which columns are best for indexing -- you don't want to index everything since the overhead might make it even slower than without indexing. You might have to play around a bit to find the best candidates to index. Usually, though, as is typical for performance optimizations, there's just a small number of columns that are causing a bottleneck; once they are indexed, it should yield much improved performance.

T

--
Fact is stranger than fiction.
Sep 25 2018
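The point above about indexing columns used in WHERE clauses can be checked with `EXPLAIN QUERY PLAN`. The following is a minimal sketch in Python's standard `sqlite3` module; the `posts` table, the `author` column, and the index name are assumptions made for illustration and do not come from DFeed.

```python
import sqlite3

# Hypothetical table; not DFeed's schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id TEXT, author TEXT, subject TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [(f"msg{i}", f"user{i % 50}", f"subject {i}") for i in range(5000)],
)

def plan(sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

query = "SELECT subject FROM posts WHERE author = ?"

# Without an index on the filtered column the planner has to scan the table.
before = plan(query, ("user7",))

# After indexing the column, the same query becomes an index search.
conn.execute("CREATE INDEX idx_posts_author ON posts (author)")
after = plan(query, ("user7",))

print(before)
print(after)
```

The exact wording of the plan text differs between SQLite versions, so the sketch only relies on the plan switching from a scan to a search that names the index.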
On Wednesday, 26 September 2018 at 01:52:31 UTC, H. S. Teoh wrote:
> I don't know what your schema looks like, so it's hard to give specifics

https://github.com/CyberShadow/DFeed/blob/master/schema.sql
Sep 25 2018
On Wednesday, 26 September 2018 at 01:52:31 UTC, H. S. Teoh wrote:
> What version of SQLite are you using? AFAIK, SQLite itself does support concurrent access. Though it does have to be explicitly compiled with that option, otherwise it will only issue a runtime error. Of course, locking is not as fine-grained, so if one request locks one table then it will block everything else.

Yep, well, it's not very good at it (as it wasn't designed for it). It locks the entire database when writing, and when the lock is held, you get an exception or have to retry on a timer. So, it's "supported" but not actually scalable.

> I don't know what your schema looks like, so it's hard to give specifics,

I posted a link to the schema earlier in the thread.

> but basically, any column used in a WHERE clause is a candidate for indexing.

Yep, I think we're past that already. The last issue I ran into was subscriptions. Some people seem to be creating subscriptions to collect and email them frequently, sometimes on every post - not that those work well, because the forum stops emailing people as soon as they have unread messages in their subscriptions, but they still get saved to the queue. Still, the longer the forum was online, the more subscriptions have accumulated, and every new post resulted in all those subscriptions getting triggered. Now, every time a subscription with an email action was triggered, we had to check if there are any unread messages in their subscription queue, and there can be a lot of messages in there - thus, this caused something like an O(m*n) database operation (with the underlying database implementation also not having a constant execution time of course).

I fixed this by limiting the check to the first unread post instead of reusing a function to count all unread messages in the subscription queue:

https://github.com/cybershadow/DFeed/commit/9cfcab2
Sep 25 2018
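The fix described above replaces "count every unread message" with "stop at the first unread message". The sketch below shows that general technique in Python's `sqlite3`; it is not the code from the linked commit, and the `queue` table, its columns, and both function names are hypothetical.

```python
import sqlite3

# Hypothetical subscription queue; not DFeed's actual table layout.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queue (subscription_id INTEGER, post_id INTEGER, is_read INTEGER)"
)
conn.execute("CREATE INDEX idx_queue_sub ON queue (subscription_id, is_read)")
conn.executemany(
    "INSERT INTO queue VALUES (?, ?, ?)",
    [(sub, post, 0) for sub in range(20) for post in range(500)],
)

def count_unread(sub_id):
    """Visits every unread row of the subscription just to produce a number."""
    row = conn.execute(
        "SELECT COUNT(*) FROM queue WHERE subscription_id = ? AND is_read = 0",
        (sub_id,),
    ).fetchone()
    return row[0]

def has_unread(sub_id):
    """Stops at the first unread row, which is all the email check needs."""
    row = conn.execute(
        "SELECT 1 FROM queue WHERE subscription_id = ? AND is_read = 0 LIMIT 1",
        (sub_id,),
    ).fetchone()
    return row is not None

print(count_unread(3) > 0, has_unread(3))
```

Both calls answer "is there anything unread?", but the `LIMIT 1` form does work proportional to one row per subscription rather than to the length of its queue, which is what removes the O(m*n) behaviour when many subscriptions fire on every new post.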
On Wed, Sep 26, 2018 at 02:33:27AM +0000, Vladimir Panteleev via Digitalmars-d wrote:
> On Wednesday, 26 September 2018 at 01:52:31 UTC, H. S. Teoh wrote:
[...]
>> but basically, any column used in a WHERE clause is a candidate for indexing.
>
> Yep, I think we're past that already. The last issue I ran into was subscriptions. Some people seem to be creating subscriptions to collect and email them frequently, sometimes on every post - not that those work well, because the forum stops emailing people as soon as they have unread messages in their subscriptions, but they still get saved to the queue. Still, the longer the forum was online, the more subscriptions have accumulated, and every new post resulted in all those subscriptions getting triggered. Now, every time a subscription with an email action was triggered, we had to check if there are any unread messages in their subscription queue, and there can be a lot of messages in there - thus, this caused something like an O(m*n) database operation (with the underlying database implementation also not having a constant execution time of course). I fixed this by limiting the check to the first unread post instead of reusing a function to count all unread messages in the subscription queue:
[...]

Hmm. I wonder if it might help if you separated the subscription queue into its own database. You're right that SQLite locks the entire database when writing, so if there's a lot of write activity going on, readers will be frequently blocked. Separating part of the data into its own DB may help increase the parallelizability of the system.

In my experience in working with SQLite, I find that generally you want to design your schema so that writes are as short as possible -- the global DB write lock can be a big bottleneck, as you said, so the less time you spend holding the write lock, the better. If it's possible to split up data for different functionalities into different DBs, that might help improve performance by avoiding waiting for the global write lock on a single DB all the time.

Now glancing over your schema, I wonder if it might make a difference if you used the implicit rowId for your 'ID' fields instead of strings. The rowId in SQLite is special, because it exists for every table implicitly, is always unique, and AFAIK allows fast lookups (or faster lookups than strings, AIUI). It may not be practical to do that now, given the large amount of data already stored with string IDs, but it could potentially make a difference. Of course, you may need to map it to strings somewhere, so I'm not sure if the tradeoff is worth it, but it might be instructive to experiment with it in an offline system to see if you could gain some performance that way.

T

--
We are in class, we are supposed to be learning, we have a teacher... Is it too much that I expect him to teach me??? -- RL
Sep 26 2018
On Wednesday, 26 September 2018 at 02:33:27 UTC, Vladimir Panteleev wrote:
> fixed this by limiting the check to the first unread post instead of reusing a function to count all unread messages in the subscription queue:
>
> https://github.com/cybershadow/DFeed/commit/9cfcab2

Seems like the issues with the forum got worse. It's hardly usable today; most of the time I am being greeted by the "forums are being overloaded" message.
Oct 04 2018
On Thursday, 4 October 2018 at 19:18:15 UTC, JN wrote:
> Seems like the issues with the forum got worse. It's hardly usable today; most of the time I am being greeted by the "forums are being overloaded" message.

Yeah, painfully aware. I've been trying a bunch of different things all day, and looks like things are back under control now.

I think the metal is reaching its limit, or there's just too much stuff running on the machine - it continued to be really slow even 10 minutes after I completely stopped the forum process.
Oct 04 2018
On Thu, Oct 04, 2018 at 11:51:02PM +0000, Vladimir Panteleev via Digitalmars-d wrote:
> On Thursday, 4 October 2018 at 19:18:15 UTC, JN wrote:
>> Seems like the issues with the forum got worse. It's hardly usable today; most of the time I am being greeted by the "forums are being overloaded" message.
>
> Yeah, painfully aware. I've been trying a bunch of different things all day, and looks like things are back under control now. I think the metal is reaching its limit, or there's just too much stuff running on the machine - it continued to be really slow even 10 minutes after I completely stopped the forum process.

Maybe it's time for the Foundation to fund dedicated hardware for running the forum software? If we're reaching the limit of the hardware, there's not much else that can be done.

T

--
Turning your clock 15 minutes ahead won't cure lateness---you're just making time go faster!
Oct 05 2018
On Friday, 5 October 2018 at 16:11:05 UTC, H. S. Teoh wrote:
> On Thu, Oct 04, 2018 at 11:51:02PM +0000, Vladimir Panteleev via Digitalmars-d wrote:
>> On Thursday, 4 October 2018 at 19:18:15 UTC, JN wrote:
>>> Seems like the issues with the forum got worse. It's hardly usable today; most of the time I am being greeted by the "forums are being overloaded" message.
>>
>> Yeah, painfully aware. I've been trying a bunch of different things all day, and looks like things are back under control now. I think the metal is reaching its limit, or there's just too much stuff running on the machine - it continued to be really slow even 10 minutes after I completely stopped the forum process.
>
> Maybe it's time for the Foundation to fund dedicated hardware for running the forum software? If we're reaching the limit of the hardware, there's not much else that can be done.
>
> T

Why not use some Cloud Services instead of worrying about hardware?

Al.
Oct 05 2018
On Fri, Oct 05, 2018 at 07:35:11PM +0000, AlCaponeJr via Digitalmars-d wrote:
> On Friday, 5 October 2018 at 16:11:05 UTC, H. S. Teoh wrote:
[...]
>> On Thu, Oct 04, 2018 at 11:51:02PM +0000, Vladimir Panteleev via Digitalmars-d wrote:
[...]
>>> Yeah, painfully aware. I've been trying a bunch of different things all day, and looks like things are back under control now. I think the metal is reaching its limit, or there's just too much stuff running on the machine - it continued to be really slow even 10 minutes after I completely stopped the forum process.
>>
>> Maybe it's time for the Foundation to fund dedicated hardware for running the forum software? If we're reaching the limit of the hardware, there's not much else that can be done.
>
> Why not use some Cloud Services instead of worrying about hardware?
[...]

Yes, and have it go down when things like the infamous AWS Outage happen. Centralization is evil.

(But of course, that still beats the status quo of a dedicated server that's overloaded and resulting in poor service... so you may have a point. :-P)

T

--
Computerese Irregular Verb Conjugation: I have preferences. You have biases. He/She has prejudices. -- Gene Wirchenko
Oct 05 2018
On 2018-10-05 22:32, H. S. Teoh wrote:
> Yes, and have it go down when things like the infamous AWS Outage happen. Centralization is evil.

And you think that cannot happen when you're managing the hardware yourself? If your hardware is not failing, your ISP still can. If you're really paranoid you'll use different regions from your cloud provider and even use different cloud providers. That all cloud providers will go down at the same time is extremely unlikely.

--
/Jacob Carlborg
Oct 06 2018
On 2018-10-05 22:32, H. S. Teoh wrote:
> Yes, and have it go down when things like the infamous AWS Outage happen. Centralization is evil.

How is having a single server (I assume) behind a single ISP any less centralized than a cloud provider?

Another advantage of using the cloud is that it's much easier to make the foundation the owners (having all the accounts and passwords) than it would be for someone's private machine. It's currently a major issue and annoyance when there's only a single person that has access to a machine and that person is not available. Just see what has happened to the DMD nightly builds. It's been down for weeks (soon a month?) and it's only Martin that has access to the machine.

If you're in the cloud and get a problem with one machine, just tear it down and rebuild it (there should obviously be a way to build the machine completely automatically).

--
/Jacob Carlborg
Oct 06 2018
On Saturday, 6 October 2018 at 09:29:42 UTC, Jacob Carlborg wrote:
> Just see what has happened to the DMD nightly builds. It's been down for weeks (soon a month?)

I've checked the logs from my stuff at Semaphore and it's been down since Sept 7, to be exact.
Oct 06 2018
On Tuesday, 25 September 2018 at 21:20:29 UTC, Vladimir Panteleev wrote:
> Sometimes the database (SQLite)

https://github.com/CyberShadow/DFeed/blob/master/schema.sql

    CREATE TABLE [Groups] (
        [Group] VARCHAR(50) NULL,
        [ArtNum] INTEGER NULL,
        [ID] VARCHAR(50) NULL,
        Time INTEGER);

If you're using SQLite you don't need to specify the size of the columns; from what I gather it's useless for this DB. You may declare GROUP VARCHAR(50), but if you insert something bigger, it will be inserted anyway, because the column adjusts according to the value.

And if I'm not mistaken it's usually preferable to use TEXT for strings.

And you should avoid NULL columns whenever you can, because they impact the index negatively.
Sep 25 2018
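The claim above that SQLite does not enforce a declared `VARCHAR` length can be checked directly. This is a minimal sketch using Python's `sqlite3`; the `groups_demo` table is a hypothetical stand-in, deliberately declared with a very short `VARCHAR(5)` so the effect is visible.

```python
import sqlite3

# Hypothetical table with a deliberately short declared length.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE groups_demo (name VARCHAR(5), artnum INTEGER)")

# The name is far longer than 5 characters, and artnum is passed as a string.
conn.execute("INSERT INTO groups_demo VALUES (?, ?)", ("digitalmars.D.learn", "42"))

name, name_type, artnum_type = conn.execute(
    "SELECT name, typeof(name), typeof(artnum) FROM groups_demo"
).fetchone()

# The long string is stored untruncated with storage class 'text';
# the INTEGER-affinity column converts the numeric string to an integer.
print(name, name_type, artnum_type)
```

In other words, the declared type only selects a type affinity: `VARCHAR(50)` and `TEXT` both give TEXT affinity, and the length in parentheses is ignored.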
On Wednesday, 26 September 2018 at 02:28:27 UTC, CharlesM wrote:
> If you're using SQLite you don't need to specify the size of the columns; from what I gather it's useless for this DB.

Yep, this is mostly descriptive. Types in column declarations have mostly the same effect.

> And if I'm not mistaken it's usually preferable to use TEXT for strings.

In SQLite? How so?

> And you should avoid NULL columns whenever you can, because they impact the index negatively.

The only thing I could find on NULL in index performance was this:

> For example, a partial index might omit entries for which the column being indexed is NULL. When used judiciously, partial indexes can result in smaller database files and improvements in both query and write performance.

However, a lot of those NULL columns should never be null. I wonder why I declared them as such...
Sep 25 2018
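The documentation passage quoted above describes partial indexes that leave out NULL entries. Below is a minimal sketch of such an index in Python's `sqlite3`; the `posts` table, the `thread_id` column, and the index name are hypothetical and not part of DFeed's schema.

```python
import sqlite3

# Hypothetical table where most rows have no thread_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id TEXT, thread_id TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [(f"msg{i}", f"thread{i % 100}" if i % 4 == 0 else None) for i in range(2000)],
)

# Partial index: only rows with a non-NULL thread_id get an index entry.
conn.execute(
    "CREATE INDEX idx_posts_thread ON posts (thread_id) WHERE thread_id IS NOT NULL"
)

# An equality test on thread_id can never match NULL, so the planner is
# allowed to use the partial index for this query.
rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM posts WHERE thread_id = ?", ("thread8",)
).fetchall()
detail = " | ".join(row[3] for row in rows)
print(detail)
```

Here three quarters of the rows never enter the index, which is the size and write-cost saving the quoted passage refers to. Whether it would matter for the forum's tables is not established in the thread.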
On Wednesday, 26 September 2018 at 02:42:15 UTC, Vladimir Panteleev wrote:
> On Wednesday, 26 September 2018 at 02:28:27 UTC, CharlesM wrote:
>> If you're using SQLite you don't need to specify the size of the columns; from what I gather it's useless for this DB.
>
> Yep, this is mostly descriptive. Types in column declarations have mostly the same effect.
>
>> And if I'm not mistaken it's usually preferable to use TEXT for strings.
>
> In SQLite? How so?

I don't remember where I read it, but it's because of type affinity; in fact it doesn't matter, because VARCHAR will be TEXT in the end:

https://www.sqlite.org/datatype3.html

So at least you save space in your script. :)

I'm a bit rusty with SQLite, but I was on a project a year ago where I had a big table, and trying it with and without NULL, I got better performance without it.

By the way I found this: https://blog.paddlefish.net/?p=885
Sep 25 2018
On Tuesday, 25 September 2018 at 21:20:29 UTC, Vladimir Panteleev wrote:
> Sometimes the database (SQLite)

SQLite was initially designed for a single local process with one connection. You should get much better results with Postgres, though of course it has some maintenance overhead (mainly installation).
Sep 25 2018
On Tuesday, 25 September 2018 at 21:12:54 UTC, Steven Schveighoffer wrote:
> Well, I thought it might be GC related also. It behaves similarly to how you would expect a GC pause to behave (several fast responses, then one that takes 5 seconds to come back).

I think that would be plausible if parts of the managed heap were swapped out to spinning-rust HDDs. Fortunately, RAM is plentiful in our case.
Sep 25 2018