Omega Owners Forum
Chat Area => General Discussion Area => Topic started by: tunnie on 22 August 2011, 10:15:15
-
This is a long shot, but any Sybase experts?
I'm booting a database, but it fails to find certain parts. They have just 'disappeared' - they have not been deleted or moved:
It loads a section ok:
kernel Initializing virtual device 9, '/opt/sybdev/data/DEV_DB_011_DATA_08' with dsync 'on'.
00:00000:00001:2011/08/20 14:57:41.73 kernel Virtual device 9 started using asynchronous i/o.
But then it fails to find about 7 parts (quite rightly, because they are not there):
Initializing virtual device 13, '/opt/sybdev/data/DEV_DB_011_DATA_12' with dsync 'on'.
00:00000:00001:2011/08/20 14:57:41.73 kernel dopen: open '/opt/sybdev/data/DEV_DB_011_DATA_12', No such file or directory
00:00000:00001:2011/08/20 14:57:41.73 kernel udactivate: error starting virtual disk 13
It previously managed to drop DATA_08. I recovered that, booted, and the DB worked for about an hour, then died again.
Booting it now suggests _12 through to _20 are missing :-?
Struggling to understand what would cause data from /opt/sybdev/data to go missing! :-/
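For anyone hitting the same thing, a rough way to list every device the server could not open is to scan the errorlog for the dopen failures quoted in this post. This is only a minimal Python sketch, not a Sybase tool: the function name `missing_devices` and the sample text are made up for illustration, and the pattern assumes the exact line format shown in this post.

```python
import re

# Assumption: the errorlog reports a missing device file in the form seen in
# this thread: "dopen: open '<path>', No such file or directory".
MISSING_DEVICE = re.compile(r"dopen: open '([^']+)', No such file or directory")


def missing_devices(log_text):
    """Return device paths reported missing, in first-seen order, without repeats."""
    seen = []
    for match in MISSING_DEVICE.finditer(log_text):
        path = match.group(1)
        if path not in seen:
            seen.append(path)
    return seen


# Sample lines copied from the errorlog excerpt in this post.
sample = """\
00:00000:00001:2011/08/20 14:57:41.73 kernel Virtual device 9 started using asynchronous i/o.
00:00000:00001:2011/08/20 14:57:41.73 kernel dopen: open '/opt/sybdev/data/DEV_DB_011_DATA_12', No such file or directory
00:00000:00001:2011/08/20 14:57:41.73 kernel udactivate: error starting virtual disk 13
"""

print(missing_devices(sample))
```

Running it over the full errorlog instead of the sample should give the complete list of files to look for on disk or in the backups.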
-
fsck could, quite happily, have removed the files if the inode structure was corrupt..
Do you know if those are actual data files (in which case it's going to be hard to recover the DB!) or transactional log files?
Sadly I know squat about Sybase; can you ask me one on MySQL instead? ;D
-
The company Sybase includes many products - which one is it?
There are SIX listed on their site.
One of their products I am an expert in - this sounds like it isn't that one!
-
recover from the backup prior to your reboot :y
-
Struggling to understand what would cause data from /opt/sybdev/data to go missing! :-/
fsck could, quite happily, have removed the files if the inode structure was corrupt..
Do you know if those are actual data files (in which case it's going to be hard to recover the DB!) or transactional log files?
Sadly I know squat about Sybase; can you ask me one on MySQL instead? ;D
How can I tell if that's the case? :-[
recover from the backup prior to your reboot :y
That's what I'm doing now. I used to be in Operations at this company; now I'm doing anything technical, and I am certainly learning a lot!
Previously I copied a chunk and renamed it to fool the DB into booting. Once it had booted I did a recovery using 7 stripes from an earlier backup, but the DB failed again. I think something is corrupted there, so I'm now resorting to a later backup.
I just don't understand why it is removing the files :-/
The company Sybase includes many products - which one is it?
There are SIX listed on their site.
One of their products I am an expert in - this sounds like it isn't that one!
It's Sybase ASE :y
-
There should be a log from fsck somewhere - location probably varies depending on Linux distribution (assuming it's Linux and not 'real' Unix).. /var/log/fsck?
If Sybase is removing those files then I'm guessing they're transactional logs - copying random ones around to new names is only going to screw the data up entirely, though..
-
I would suggest trawling their web site and help files.
Not sure what there is for ASE but some of the servers have masses of help information.
BTW, with ADS (another of their products) we have never had to do a full recovery, but that uses discrete files.
All part of SAP now.
My thoughts are a full restore as well.
-
I'm hoping this server was backed up to tapes, as I think I now need to restore from about a month back.
It still refuses to boot. Apart from some test errors, the log does not really indicate anything.
http://dl.dropbox.com/u/803897/Work/PP/Log1.rtf
*ignore the license error, they are too cheap to spend the £50k for it...
-
Remind us of the history...
This database was happily running till you accidentally turned off the box?
When was it last rebooted?
how often and what type of backups are taken? full, hot backups?
in use files?
-
It was bust long before I got my mitts on it. I got a call from my mate, who said everything was down! They rebooted the server (not physically). Sadly we lost the logs from this...
It was decided we should bounce the box at the blade centre. The person I replaced said something similar had happened before; a hard reboot solved it.
Booting back up, it was missing _08. Our thinking was that it was corrupted or lost in the reboot.
But once we got it back it ran for a short time, then the DB failed again. Every time I try to connect I get:
Client Message
Layer 6, Origin 8, Severity 5, Number 3
ct_connect(): directory service layer: internal directory control layer error: Requested server name not found.
Ahh yes, backups. This company is failing and the end is near. To save money when the contract ran out at Rackspace, everything was moved in-house: a number of key services were no longer available, all redundancy was lost, and the BladeCentre was no longer RAID5, with each blade running on its own HD.
I'm investigating backups now, but it does not look good :'(
Shame, I only needed to keep this POS system running until November :(
-
Last physical reboot was last Wednesday; soft reboot, yesterday. I can see Daily/Weekly/Monthly backups to tapes, which are then outsourced to Iron Mountain. What's actually on those tapes, I've no idea ;D
I used to be the Ops Analyst here, but now I'm doing the DB Architect's job, Windows Sys Admin, Linux Sys Admin, Billing Analyst, Ops job again, all rolled into one ;D
-
Ohh dear.
straw clutching time then....
Ahh well, if they won't spend, what do they expect....
that message won't go down well with management though.
Is the 'possibility' logged and monitored via a risk log? If so, at least it's been highlighted, which reduces the trouble.
-
It's such a shame. I joined this company fresh from uni, when it was 60 strong and going places. I had the chance of moving to New York for a new product, then the Americans came and stuffed it all up.
They paid a considerable 7-figure sum for the company; it's now just 5 people left, 6 if you include me as part time.
I'm mates with my boss, so it's all good for me.
To give you an idea of what's going on here: they want us (me) to drop the £15k SAN solution they bought 4 years ago, and as part of cost cutting they want to use Dropbox as a general share for all company documents ::) :o :-X
;D ;D ;D
-
Tunnie,
for one thing I'm sure you are having trouble at the unix level before sybase..
which may be a hands-on job for TheBoy ;)
-
When this all started failing, our first thought was hard drive failure. The BladeCentre is 5 years old, and the Sybase person who worked here before also suggested that was likely.
If no backups are around, we are fubared!
-
"00:00000:00001:2011/08/22 11:22:51.69 server Database 'master' is now online."
this means that Sybase started up properly with the master database (which is critical to the server itself)
still checking the logs....
-
Thanks Cem :y
-
can't see initialization of disks DATA_10 and DATA_11 :-/
seems like they partitioned the data across multiple disks, which means trouble :-/
-
00:00000:00011:2011/08/22 11:22:52.83 server Database 'replicator_test' cannot be opened. An earlier attempt at recovery marked it 'suspect'. Check the SQL Server errorlog for information as to the cause.
yep.. you can't touch a suspect database directly from DB tools.. it must be stopped first of all..
and also SQL Server does the same when it can't reach the files physically :-/
also
server Started UNDO pass for database 'billing'. The total number of log records to process is 6756.
00:00000:00011:2011/08/22 11:22:53.22 server Undo pass for database 'billing': 6000 records done (88%); 756 records left.
00:00000:00011:2011/08/22 11:22:53.22 server Undo pass of recovery has processed 1 incomplete transactions.
this must have happened when it was shut down..
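The two message types seen in this thread ("is now online" and "marked it 'suspect'") can be pulled out of the errorlog mechanically to get a quick per-database status. A minimal Python sketch, assuming the message wording matches what this log shows; `database_states` is a hypothetical helper name, not anything shipped with Sybase.

```python
import re

# Assumption: both patterns follow the wording of the errorlog lines quoted in
# this thread; other ASE versions may phrase these messages differently.
ONLINE = re.compile(r"Database '([^']+)' is now online")
SUSPECT = re.compile(
    r"Database '([^']+)' cannot be opened\. "
    r"An earlier attempt at recovery marked it 'suspect'"
)


def database_states(log_text):
    """Map database name -> 'online' or 'suspect' based on errorlog messages."""
    states = {}
    for name in ONLINE.findall(log_text):
        states[name] = "online"
    # Suspect is applied last so it wins if a database shows up under both.
    for name in SUSPECT.findall(log_text):
        states[name] = "suspect"
    return states


# Sample lines copied from the errorlog excerpts in this thread.
sample = """\
00:00000:00001:2011/08/22 11:22:51.69 server Database 'master' is now online.
00:00000:00011:2011/08/22 11:22:52.83 server Database 'replicator_test' cannot be opened. An earlier attempt at recovery marked it 'suspect'. Check the SQL Server errorlog for information as to the cause.
"""

for name, state in sorted(database_states(sample).items()):
    print(name, state)
```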
-
Thanks Cem, yet another pointer to the HD. It wouldn't have been a problem if they had kept it RAID. >:(
-
yep.. now reached the end of the log..
according to the logs Sybase is working properly and most of its databases were brought online..
2 databases are affected by the problem
'replicator_test' is suspect, which means you can't reach it or see its files..
the other one is the 'billing' database
Started REDO pass for database 'billing'.
The total number of log records to process is 13705
server Started UNDO pass for database 'billing'. The total number of log records to process is 6756.
00:00000:00011:2011/08/22 11:22:53.22 server Undo pass for database 'billing': 6000 records done (88%); 756 records left.
00:00000:00011:2011/08/22 11:22:53.22 server Undo pass of recovery has processed 1 incomplete transactions.
which means about half of the records were written and committed and nearly half of the transactions were rolled back.. (so it's not in the same state it was last in)
probably when the operating system was trying to recover from disk/file system problems it erased some of the replicator_test files..
this is all I can say.. :-/
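The "about half" reading can be checked against the numbers in the log. Worth noting that the totals count log records rather than transactions, and the same log reports only 1 incomplete transaction in the undo pass, so the ratio below is a ratio of log records. A minimal Python sketch, assuming the message wording quoted in this post; `pass_totals` is a hypothetical helper name.

```python
import re

# Assumption: recovery pass totals appear in the wording quoted in this post.
# \s* allows the two sentences to sit on one line or be split over two.
PASS_START = re.compile(
    r"Started (REDO|UNDO) pass for database '([^']+)'\.\s*"
    r"The total number of log records to process is (\d+)"
)


def pass_totals(log_text):
    """Return {database: {'REDO': n, 'UNDO': m}} from recovery start messages."""
    totals = {}
    for kind, database, count in PASS_START.findall(log_text):
        totals.setdefault(database, {})[kind] = int(count)
    return totals


# Sample lines copied from the log excerpt in this post.
sample = """\
Started REDO pass for database 'billing'.
The total number of log records to process is 13705
server Started UNDO pass for database 'billing'. The total number of log records to process is 6756.
"""

totals = pass_totals(sample)
billing = totals["billing"]
undo_share = billing["UNDO"] / billing["REDO"]
print(f"undo/redo log records: {undo_share:.2f}")  # prints 0.49
```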
-
Thanks Cem, much appreciated :y :y
-
no probs.. I'd be happy if it helps a bit :y :y
-
Initial reaction, based on my limited MSSQL and MySQL knowledge, is that it's possibly the underlying filesystem. If the DB removes files itself, I'd like to think it was clever enough not to try to load them. Not looked at the logs though.
Cem knows far more about RDBMS than I ever will, and he has cast his eyes over it.
I'm guessing if you had bought the licence, rather than run an illegal copy, you might have had some support from Sybase. Also guessing you don't have an active Redhat licence to cover the OS either ::)
-
agreed..
and thanks :y
-
It's not illegal, it's legit; it's just that certain features are not available without the license. They lost various tools, one of which was the customer service tool. That was not deemed important enough!
Thanks everyone for their input :y
We keep coming back to the same conclusion of file system / hardware, I'm looking at that now :) :y
-
Also, RedHat is all legal too; it's all above board :)
-
Tunnie,
in my opinion, although it depends on the application,
it may be better for you to turn autocommit on (yes/on, whatever the setting is called) in the DBMS configuration.. so that in case of problems your losses will be minimal..
good luck :y
-
cheers Cem :y
-
Also, RedHat is all legal too; it's all above board :)
Hokey dokey, in which case you have a support contract with them, so you can use them to help track down whether it's a hardware failure :y