Hive Tuning - Big Data Documentation.docx
Hive Performance
Toronto Hadoop User Group, July 23, 2013
Presenter:
Adam Muise – Hortonworks, amuise@
Deep Dive content by Hortonworks, Inc. is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
Agenda
• Hive – What Is It Good For?
• Hive's Architecture and SQL Compatibility
• Turning Hive Performance to 11
• Get Data In and Out of Hive
• Hive Security
• Project Stinger – Making Hive 100x Faster
• Connecting to Hive From Popular Tools
Hive – SQL Analytics For Any Data Size
[Figure: "Store and Query all Data in Hive" and "... and Existing SQL Processes". Data sources shown: Weblog, Operational/MPP, Mobile, Sensor.]
Hive's Focus
• Scalable SQL processing over data in Hadoop
• Scales to 100 PB+
• Structured and Unstructured data
Comparing Hive with RDBMS
Hive | RDBMS
SQL Interface. | SQL Interface.
Focus on analytics. | May focus on online or analytics.
No transactions. | Transactions usually supported.
Partition adds, no random INSERTs. In-place updates not natively supported (but are possible). | Random INSERT and UPDATE supported.
Distributed processing via map/reduce. | Distributed processing varies by vendor (if available).
Scales to hundreds of nodes. | Seldom scale beyond 20 nodes.
Built for commodity hardware. | Often built on proprietary hardware (especially when scaling out).
Low cost per petabyte. | What's a petabyte?
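The Hive column above says data arrives through partition adds rather than random INSERTs, and that in-place updates are possible but not native. A minimal sketch of what that can look like in practice is below. It is written in Python only to assemble the HiveQL text. The table names (weblogs, weblogs_staging), the dt partition column, and the HDFS path are hypothetical and do not come from the deck. Rewriting a whole partition with INSERT OVERWRITE is one common way to emulate an update, not necessarily the approach the presenter had in mind.

```python
# Sketch only: builds HiveQL statement text; it does not connect to Hive.
# Table names, the "dt" partition column, and paths are hypothetical.


def add_partition_sql(table: str, dt: str, location: str) -> str:
    """Return HiveQL that registers an existing directory as a new partition."""
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (dt='{dt}') LOCATION '{location}'"
    )


def rewrite_partition_sql(table: str, dt: str, select_sql: str) -> str:
    """Return HiveQL that replaces one whole partition with a query result.

    This emulates an "update": the corrected rows are computed by select_sql
    and the partition is overwritten as a unit, not modified row by row.
    """
    return (
        f"INSERT OVERWRITE TABLE {table} PARTITION (dt='{dt}') "
        f"{select_sql}"
    )


if __name__ == "__main__":
    # Partition add: new data becomes queryable without row-level INSERTs.
    print(add_partition_sql(
        "weblogs", "2013-07-23", "/data/weblogs/dt=2013-07-23"
    ))
    # Emulated update: rebuild the partition from a (hypothetical) staging table.
    print(rewrite_partition_sql(
        "weblogs", "2013-07-23",
        "SELECT ip, url FROM weblogs_staging WHERE dt='2013-07-23'",
    ))
```

The generated statements can be pasted into the Hive CLI or submitted through whatever client is in use. Values are interpolated without escaping, so this is suitable for illustration only.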
Agenda
• Hive – What Is It Good For?
• Hive's Architecture and SQL Compatibility
• Turning Hive Performance to 11
• Get Data In