Comments on Sandeep's Research Notes: RCFile and CIF

Deepak (2013-05-08 18:40 -07:00):
Thanks.

Sandeep (2013-05-08 08:38 -07:00):
Deepak,

The code that I wrote for the benchmarks was at a previous job, and it is not available under an open-source license. As for RCFile APIs, I'd recommend pinging the Hive-dev or Hive-users mailing lists. For a start, you can look at how Hive uses its ObjectInspectors in conjunction with the InputFormat to get at the data in RCFiles. Good luck!

Deepak (2013-05-08 05:03 -07:00):
I was going through your paper, where you benchmarked TXT, SEQ and CIF. Could you please share that code (a GitHub project, perhaps)? I am trying to do something similar with Parquet/RCFile and possibly CIF. We need to narrow down our choice of columnar storage format.

Deepak (2013-05-08 00:51 -07:00):
Sandeep,
I am trying to store plain text data on HDFS in RCFile format, and would then like to benchmark reads across various columns. I am not able to find API documentation or a JAR file that contains the RCFile implementation. All I found was that Hive supports storing data on HDFS in RCFile format. I am looking for the Input/OutputFormat APIs for RCFile. Please share.